Background: The distribution of chemical species in an open system at metastable equilibrium can be expressed
as a function of environmental variables which can include temperature, oxidation-reduction potential and others. 
Calculations of metastable equilibrium for various model systems were used to characterize chemical transformations 
among proteins and groups of proteins found in different compartments of yeast cells. 

Results: With increasing oxygen fugacity, the relative metastability fields of model proteins (including isoforms
of glutaredoxin and thioredoxin, and compartmental proteomes) for major subcellular compartments go as
mitochondrion, endoplasmic reticulum, cytoplasm, nucleus. Compared with experimental determination of redox
potential (Eh) in these compartments, the order of the endoplasmic reticulum and nucleus is swapped. In a
metastable equilibrium setting at relatively high oxygen fugacity, proteins making up actin are predominant, but
those constituting the microtubule occur with a low chemical activity. Nevertheless, interactions of the
microtubule with other subcellular compartments are essential in cell development. A reaction sequence involving the
microtubule and spindle pole proteins was predicted by combining the known intercompartmental interactions 
with a hypothetical program of oxygen fugacity changes in the local environment. In further calculations, the 
most-abundant proteins within compartments generally occur in relative abundances that only weakly correspond 
to a metastable equilibrium distribution. However, physiological populations of proteins that form complexes often 
show an overall positive or negative correlation with the relative abundances of proteins in metastable assemblages. 

Conclusions: This study explored the outlines of a thermodynamic description of chemical transformations 
among interacting proteins in yeast cells. Full correspondence of the model with biochemical and proteomic 
observations was not apparent, but the results suggest that these methods can be used to measure the degree of 
departure of a natural biochemical process or population from a local minimum in Gibbs energy. 

Author Summary 

Part of a cell's expenditure of metabolic fuel is directed toward the formation of proteins, including their synthesis 
and transport to other compartments. Even when it is normalized to the lengths of the proteins, the energy 
required for protein formation is not a constant, but depends on the composition and environment of the protein. 
If these energy differences are quantified, the relative abundances of model proteins in metastable equilibrium can 
be calculated. The compositions of these metastable assemblages depend on local environmental variables such as 
oxygen fugacity, which is a scale for oxidation-reduction potential in a system. I calculate the oxygen fugacities for 
equal chemical activities of model proteins in intercompartmental interactions and use the results to obtain model 
values of oxygen fugacity for subcellular compartments. I show that an environmental gradient of oxygen fugacity
can potentially drive the formation of proteins in a sequential order determined by their chemical compositions 
and Gibbs energies. I also show that the relative abundances of proteins within compartments and of those that 
form complexes have a dynamic range that can be approximated in some metastable equilibrium assemblages. 
These results provide theoretical constraints on the natural emergence of spatial and temporal patterns in the 
distributions of proteins and imply that work done by maintaining oxidation-reduction gradients can selectively 
alter the degree of formation of proteins and complexes. 



Introduction 



Subcellular compartmentation is a basic feature of eukaryotic life [1] [2] [3] [4]. There exist in eukaryotic cells gradients
between subcellular compartments of chemical properties such as pH [5] [6] [7] [8], oxidation-reduction or redox state
[9] [10] [11] [12] and chemical activity of water [13] [14] [15] [16]. Furthermore, the proteins required by yeast and
other organisms are unevenly localized throughout cells [17] [18] [19] [20] [21] [22] [23]. Even within compartments or
among the proteins that interact to form complexes, the relative abundances or levels of different proteins are not
equal [24] [25] [26], and different proteins predominate in the various subcellular populations depending on growth
state of the cell [27, 28] and exposure to environmental stress [29] [30] [31] [32].

Much attention has been given to the use of thermodynamics in describing and understanding driving forces in 
biological evolution. Energy minimization imparts a direction for spontaneous change of a system, and response of a 
system in this direction can at times be tied to an increase in relative fitness [33, 34, 35, 36, 37]. A biological system
that moves away from minimum energy does not break the laws of thermodynamics but couples its endergonic
reactions with the exchange of matter and energy in its surroundings [38] [39] [40] [41]. The thermodynamic
characteristics of open systems are thus of particular interest to biological evolution [42] [43] [44]; in particular,
the interactions of organisms with their environments are important influences on the stable compositions and
distributions of genes or organisms [45] [46] [47] [48].

Why are proteins not equally distributed inside cells? Physical separation of key enzymes is thought to be 
essential in the cytoskeletal network and in regulation of metabolic pathways and other cellular functions [49] [50]
[51]. The patterns of subcellular structure persist even though populations of proteins turn over through continual
degradation and synthesis in cells [52] [53] [54] [55], and despite the endergonic, or energy-consuming, qualities of
protein biogenesis [56] [57]. It can be shown that the relative abundances of amino acids in proteins correlate
inversely with the metabolic cost of synthesis of the amino acid [58] [59], which is a temperature-dependent
function [60]. The starting premise of this study, then, is that protein formation reactions are unfavorable to
different degrees, depending on the environments and compositions of the biomolecules. 

The application of equilibrium chemical thermodynamics as a way to characterize the relative stabilities of 
minerals as a function of temperature, pressure and oxidation-reduction potential [61] [62] [63], or to calculate the
relative abundances of coexisting inorganic [64] [65] and/or organic species [41] [66], is well documented in the 
geochemical literature. An advantage of performing quantitative chemical thermodynamic calculations for many 
different model systems is that the equilibrium state serves as a frame of reference for describing both reversible 
and irreversible chemical changes. For example, the weathering of igneous rocks is an overall irreversible process 
but the sequences of minerals formed can nevertheless be predicted after initial formulation of the relative stability 
limits of the chemical species involved [67, 68]. One of the motivations for this study is to see whether a similar
approach could be used to describe the sequence of events in irreversible subcellular processes. 

The thermodynamic calculations reported in this study are based on algorithms for calculating the standard 
molal Gibbs energies of ionized proteins [69] and a chemical reaction framework that is used to compute metastable
equilibrium relative abundances of proteins [70]. The Supporting Information for this paper includes the software
package (Text S1) and the program script and data files (Text S2) used to carry out these calculations. The
theoretical approach adopted here is based on the description of a chemical system in terms of intensive variables.
These variables are temperature, pressure and the chemical potentials of the system. It is convenient to denote the
chemical potentials by the chemical activities or fugacities of basis species, for example the activity of H+ (which
defines pH) or the fugacity of oxygen. This permits comparison of the parameters of the model with reference 
systems described in experimental and other theoretical biochemical studies. 

A few notes on terminology follow. Formation of a protein refers to the overall process of protein biosynthesis 
and translocation to a specific compartment. Activity and species denote, respectively, chemical activity and 
chemical species, not enzyme activity or biological species. In the present study, activity coefficients are taken to 
be unity, so the chemical activities are equivalent to molal concentrations. Below, oxidation-reduction potential 
and oxygen fugacity are used synonymously, and redox refers specifically to Eh. The oxidation-reduction potential 
of a system can be expressed in terms of Eh using an equation given in the Methods. The overall compositions
of proteins in compartments are referred to here as proteologs (or model proteologs). The interactions of
proteins are processes in which the proteins come into physical contact, for example in transport processes
between compartments and in the formation of complexes. If a process results in a change in the composition of
a population of interacting proteins, then a chemical reaction has occurred. Protein-protein interactions do not
necessarily correspond to chemical reactions. However, a population of interacting proteins does chemically react
if the turnover rates of the proteins are not all the same or if, through evolution, the genes coding for the proteins
undergo different non-synonymous mutations. Model systems consisting of interacting proteins are useful targets
for assessing the potential for chemical reactivity, which might occur on evolutionary time scales longer than the
physical interactions.

a. Amino acid compositions of subcellular isoforms of glutaredoxin (GLRX), thioredoxin (TRX) and thioredoxin
reductase (TRXB) in S. cerevisiae were taken from the SWISS-PROT database [71] (accession numbers shown
in the table). Chemical formulas of nonionized proteins, and calculated standard molal Gibbs energy of formation
from the elements (ΔG°, in kcal mol⁻¹, at 25 °C and 1 bar) and net ionization state (Z) at pH = 7 of charged
proteins are listed. Average nominal oxidation state of carbon (Z_C) was calculated using Eqn. (12).

The purpose of this study is to quantify, using a metastable equilibrium reference state, the responses of
populations of model proteins for different subcellular compartments of S. cerevisiae to gradients of
oxidation-reduction potential. There are two major parts to this paper. In the first part, the reactions
corresponding to intercompartmental interactions between isoforms (or homologs) of particular enzymes and between
proteologs are quantified by calculating the oxygen fugacities for equal chemical activities of the reacting
proteins or proteologs in metastable equilibrium. A ranking of relative metastabilities of the proteologs is
discussed. Specific known interactions between compartments are considered in order to derive values of the oxygen
fugacity within compartments that best metastabilize the corresponding proteologs relative to those of other
compartments. Equal-activity values of the oxygen fugacity in the reactions are used to predict a sequence of
formation of model proteologs in response to a temporal oxidation-reduction gradient.

In the second part of this paper, the relative abundances of model proteins in metastable equilibrium are 
calculated and compared with measured abundances. The range of protein abundances in a metastable equilibrium 
population often approaches that seen in experiments over a narrow window of oxygen fugacity. Positive and 
negative correlations between the calculated and experimental relative abundances are found in some cases. Local 
energy minimization and its opposition in the cellular demands for selectivity in protein formation are discussed as 
possible processes leading to the observed patterns. 

Results and Discussion 



Calculated metastability relations are described below for intercompartmental interactions between the model
homologs and proteologs, and for intracompartmental interactions among the most abundant proteins in
compartments or the reference model complexes. Experimental comparisons and discussion of their implications are
integrated with these results.

Relative metastabilities of subcellular homologs of redoxins

The cytoplasmic, nuclear and mitochondrial homologs of glutaredoxin [72] [73] [74] and thioredoxin/thioredoxin
reductase [75] [11] in yeast cells represent the first model systems for subcellular environments studied here. The
names and chemical formulas of these proteins are listed in Table 1 together with some computed properties. The
average nominal oxidation state of carbon (Z_C) is a function of the relative proportions of the elements in the
chemical formula (see Methods). These values are provided just to get some initial bearing on the differences in
compositions of the proteins. In Table 1, the proteins with the lowest values of Z_C are the mitochondrial homologs
and those with the highest values of Z_C are the nuclear homologs.
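
The dependence of the average nominal oxidation state of carbon on the chemical formula can be illustrated with a
short Python sketch. Eqn. (12) is given in the Methods and is not reproduced in this section, so the expression
used here, Z_C = (3 n_N + 2 n_O + 2 n_S - n_H + Z) / n_C, is an assumption based on the usual oxidation-number
convention (H = +1, N = -3, O = -2, S = -2), and the function names are hypothetical. As a check, the nonionized
formulas of two proteologs from Table 3 are used as input.

```python
import re

def parse_formula(formula: str) -> dict:
    """Parse a formula such as 'C2788.9H4430.5N771.5O899.0S13.3' into element counts."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*\.?\d*)", formula):
        # A missing number means a count of one (e.g. the O in 'H2O').
        counts[element] = counts.get(element, 0.0) + (float(number) if number else 1.0)
    return counts

def carbon_oxidation_state(formula: str, charge: float = 0.0) -> float:
    """Average nominal oxidation state of carbon (assumed form of Eqn. 12)."""
    n = parse_formula(formula)
    numerator = (3 * n.get("N", 0.0) + 2 * n.get("O", 0.0)
                 + 2 * n.get("S", 0.0) - n.get("H", 0.0) + charge)
    return numerator / n["C"]

# Nonionized proteolog formulas copied from Table 3 (charge = 0).
print(f"{carbon_oxidation_state('C4110.6H6516.1N1091.9O1272.1S20.6'):.3f}")  # -0.159
print(f"{carbon_oxidation_state('C2788.9H4430.5N771.5O899.0S13.3'):.3f}")    # -0.104 (nucleolus)
```

Both results agree with the Z_C values listed for these formulas in Table 3, which suggests that the assumed
expression matches the one used in the study, at least for uncharged species.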

Because the current objective is to describe the compositions of populations of proteins in terms of a variable
like oxidation-reduction potential, a quantity such as Z_C is not sufficient; it has no explicitly derivable relation
to intensive properties that can be measured. The forces acting on chemical transformations among proteins can,
however, be assessed by first writing chemical reactions denoting their formation. An example of this procedure is
given further below for a specific model system. The basic methods that apply there were used throughout this
study. The standard molal Gibbs energies (ΔG°) and net charges of ionized proteins at pH = 7 are listed in
Table 1 so that the results described below can be reproduced at this pH.

In Figs. 1a and b the metastable equilibrium predominance limits of ionized proteins in the glutaredoxin and
thioredoxin/thioredoxin reductase model systems are shown as a function of the logarithm of oxygen fugacity and
pH. Here, the predominant protein in a population is taken to be the one with the greatest chemical activity. The
computation of the relative metastabilities of the proteins included all five model proteins in the glutaredoxin
system as candidates, but note regarding Fig. 1a that only two of the five proteins appear on the diagram. Those
that do not appear are less metastable, or have greater energy requirements for their formation over the range of
conditions represented in Fig. 1a, than either of the proteins appearing in the figure.

The equal-activity lines in these pH diagrams are curved because the ionization states of the proteins depend on
pH. The observation apparent in Fig. 1a that increasing log fO2(g) favors formation of the cytoplasmic protein
homolog relative to its mitochondrial counterpart is also true for the thioredoxin/thioredoxin reductase system
shown in Fig. 1b. In comparing Figs. 1a and b note that in the latter figure, predominance fields for a greater
number of candidate proteins appear, and that the predominance field boundary between mitochondrial and
cytoplasmic proteins occurs at a lower oxidation-reduction potential. The dashed lines shown in each diagram of
Fig. 1 are reference lines denoting the reduction stability limit of H2O (log fO2(g) ≈ -83.1 at 25 °C and 1 bar [76]).

Figure 1: Relative metastabilities of homologs of glutaredoxin and thioredoxin/thioredoxin reductase. Predominance
diagrams were generated for homologs of (a,c,e) glutaredoxin and of (b,d,f) thioredoxin/thioredoxin reductase in
S. cerevisiae. The letters in parentheses following the labels indicate the subcellular compartment to which the
protein is localized (C - cytoplasm; M - mitochondrion; N - nucleus). Calculations were performed for ionized
proteins at 25 °C and 1 bar and for reference activities of basis species noted in the Methods. Reduction stability
limits of H2O are shown by dashed lines; the dotted lines in (c) and (d) correspond to the plot limits of (a) and (b).

Predominance diagrams as a function of Eh and pH for the glutaredoxin and thioredoxin/thioredoxin reductase
systems are shown in Figs. 1c and d. Like log fO2(g), Eh and pH together are a measure of the oxidation-reduction
potential of the system; the different scales can be converted using Eqn. (11). The trapezoidal areas bounded
by dotted lines in Figs. 1c and d show the ranges of Eh and pH corresponding to the log fO2(g)-pH diagrams of
Figs. 1a and b. It can be deduced from these diagrams that if the upper log fO2(g) limit of Fig. 1a were extended
upward, this diagram would include a portion of the predominance field for the nuclear protein GLRX3.

It appears from Figs. 1a-b that increasing log fO2(g) at constant pH, or increasing pH at constant
oxidation-reduction potential, have similar consequences for the relative metastabilities of the cytoplasmic and
mitochondrial homologs. In this analysis, however, pH does not appear to be a very descriptive variable; the
magnitude of the effect of changing oxygen fugacity over several log units is greater than the effect of changing
pH by several units. In further metastability calculations pH was set to 7. Also, because Eh itself is defined in
terms of pH, the oxidation-reduction potential variable adopted below is log fO2(g), which is more directly related
to the potential of a thermodynamic component.

In Figs. 1e and f the logarithm of activity of water (log aH2O) appears as a variable. In Fig. 1e it can be seen
that the formation of a nuclear homolog of GLRX is favored relative to the cytoplasmic homologs by decreasing
activity of water and/or increasing oxygen fugacity, and that increasing relative metastabilities of the mitochondrial
proteins are consistent with lower oxidation-reduction potentials and to some extent higher activities of water. In
Fig. 1f it appears that the formation of the thioredoxin reductases relative to thioredoxins in each compartment
is favored by increasing fO2(g), and that for the TRX the relative metastabilities of the mitochondrial proteins
increase with decreasing fO2(g).



Comparison with subcellular redox measurements 



Let us compare the positions of the predominance fields in Fig. 1 with measured subcellular redox states. The
values of Eh derived from the concentrations of oxidized and reduced glutathione (GSSG and GSH, respectively) in
extra- and subcellular environments reported in various studies [9] [10] [77] [78] [79] were converted to
corresponding values of log fO2(g) using Eqn. (11) in the Methods and are listed in Table 2. In order to fill in
the table as completely as possible, it was necessary to consider measurements performed on eukaryotic cells other
than those of S. cerevisiae (e.g., HeLa [80] and mouse hybridoma [81] cells). The values of pH required for
conversion of Eh to log fO2(g) were also retrieved from the literature (sources are given in the footnotes of
Table 2). The computation of log fO2(g) from Eh was performed at 25 °C and 1 bar and with log aH2O = 0. No
measurements of vacuolar Eh have been reported, but it has been noted that Fe+3 predominates over Fe+2 in this
compartment [85]. Hence, a nominal (and relatively very oxidizing) value of Eh for the vacuole was calculated that
corresponds to equal activities of Fe+3 and Fe+2.

Table 2: Nominal electrochemical characteristics of subcellular environments in eukaryotes. Values refer to yeast
cells unless noted otherwise.

Environment                 Eh, volt               pH        log fO2(g) (m)
Extracellular (intestine)   -0.137 to -0.80 (a)    3 (g)     -83.3 to …
Cytoplasm                   -0.235 to … (b)        6.5 (h)   … to …
Nucleus                     (c)                    7.7 (i)   (c)
Mitochondrion               -0.360 (d)             8 (j)     -78.3
Endoplasmic reticulum       -0.185 to -0.133 (e)   7.2 (k)   -69.7 to -66.2
Vacuole                     > +0.769 (f)           6.2 (l)   > -9.2

a. [77] (Homo sapiens). b. The lower and upper values are taken from [78] and [79], respectively. c. The state of
the GSSG/GSH couple in the nucleus is thought to be more reduced than in the cytoplasm [4]; see text. d. [10] (Homo
sapiens HeLa [80] cells). e. [9] (Mus musculus: mouse hybridoma cells [81]). f. Calculated by combining the law of
mass action for Fe+3 + e- ⇌ Fe+2 (standard molal Gibbs energies taken from [82]) with aFe+3 = aFe+2 (see text).
g. [83] (Homo sapiens). h. [6] (yeast). i. [84] (organism unspecified). j. [7] (HeLa). k. [8]. l. [5]. m. Values of Eh
and pH listed here were combined with Eqn. (11) at T = 25 °C, P = 1 bar and aH2O = 1 to generate the values of
log fO2(g).
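
The conversion from Eh and pH to log fO2(g) can be sketched in Python as follows. Eqn. (11) is given in the
Methods and is not reproduced here, so this is only an illustration that assumes the law of mass action for
2 H2O = O2 + 4 H+ + 4 e-, with pe = F Eh / (ln(10) R T). The function name is hypothetical. The intercept
log_k = -86.0 is also an assumption: it is the value that makes this sketch reproduce the log fO2(g) entries of
Table 2, it is not taken from the Methods, and it differs from the reduction stability limit of H2O (-83.1) quoted
above for Fig. 1.

```python
import math

R = 8.314462618   # gas constant, J mol^-1 K^-1
F = 96485.33212   # Faraday constant, C mol^-1

def eh_to_log_fo2(eh_volt, ph, log_a_h2o=0.0, temp_k=298.15, log_k=-86.0):
    """Convert Eh (volts) and pH to log fO2 for 2 H2O = O2 + 4 H+ + 4 e-.

    log_k is an assumed intercept valid only near 25 C; it was chosen here
    to match Table 2 and is not a value stated in the source.
    """
    pe = eh_volt * F / (math.log(10) * R * temp_k)
    return log_k + 4.0 * (pe + ph) + 2.0 * log_a_h2o

# Eh and pH pairs copied from Table 2.
print(f"{eh_to_log_fo2(-0.360, 8.0):.1f}")   # mitochondrion: -78.3
print(f"{eh_to_log_fo2(-0.185, 7.2):.1f}")   # ER, lower value: -69.7
print(f"{eh_to_log_fo2(-0.133, 7.2):.1f}")   # ER, upper value: -66.2
print(f"{eh_to_log_fo2(0.769, 6.2):.1f}")    # vacuole: -9.2
```

Because Eh and pH enter only through the sum pe + pH, a fixed value of log fO2(g) plots as a straight line of
slope -ln(10) R T / F (about -0.059 volt per pH unit at 25 °C) on an Eh-pH diagram.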

The available measurements of redox states in compartments of eukaryotic cells can be summarized as, from 
most reducing to most oxidizing, mitochondria - nucleus - cytoplasm - endoplasmic reticulum - extracellular [4]. 
Strong redox gradients within the mitochondrion are essential to its function [86], which is not captured by the
single values listed in Table 2. Comparison nevertheless with the computational results shown in Fig. 1 indicates
that a relatively reducing environment does metastably favor the mitochondrial homolog.

Measurements of GSH/GSSG concentrations point to a lower redox state in the nucleus than in the cytoplasm,
but the chemical thermodynamic predictions show the nuclear proteins favored by relatively oxidizing conditions.
Studies using nuclear magnetic resonance (NMR) showing that the hydration state of the nucleus is higher than
that of the cytoplasm [16] [13] bring into question the prediction consistent with Fig. 1e that the formation of the
nuclear proteins is favored relative to their cytoplasmic counterparts by decreasing activity of water. Also,
mitochondrial pH is somewhat higher than that of the cytoplasm [6] [7], but in Figs. 1a and b it appears that the
predicted energetic constraints favor the cytoplasmic proteins at higher pHs. These comparisons indicate that not
all metastable equilibrium constraints are preserved in the spatial relationships of the homologous redoxins in the
cell.

Table 3: Overall protein compositions (proteologs) of compartments in yeast cells (a)

Compartment   …     …       Formula                             ΔG°      Z       Z_C      log fO2(g)
…             46    815.6   C4110.6H6516.1N1091.9O1272.1S20.6   -33153   -11.5   -0.159   -75.2
nucleolus     60    564.3   C2788.9H4430.5N771.5O899.0S13.3     -23928   -10.9   -0.104   -75.0
nucleus       453   572.1   C2843.6H4542.8N802.2O893.6S20.3     …        -2.3    -0.…     -71.5

a. Chemical formulas of nonionized proteologs and standard molal Gibbs energy of formation from the elements
(ΔG°, in kcal mol⁻¹, at 25 °C and 1 bar) and net ionization state (Z) at pH = 7 of ionized proteologs were
calculated using the overall amino acid compositions given in Table S1. Values of the nominal oxidation state
of carbon (Z_C) were calculated using Eqn. (12). log fO2(g) values for compartments were determined from the
metastable equilibrium limits of subcellular interactions listed in Table 4.

Relative metastabilities of proteologs 

The chemical formulas and thermodynamic properties of the model proteologs - hypothetical proteins representing
the overall amino acid compositions of compartments (see Methods) - are listed in Table 3. The predominance
diagrams in Fig. 2 depicting the relative metastabilities of the model proteologs as a function of log fO2(g) and
log aH2O were generated in sequential order. The first diagram in this figure corresponds to a system in which all
23 proteologs were considered. Subsequent diagrams in Fig. 2 were generated by eliminating from consideration
some or all of the proteologs represented by predominance fields in the immediately preceding diagram. It can
be seen in Fig. 2a that consideration of 23 proteologs resulted in predicted predominance fields for six proteins
over the ranges of log fO2(g) and log aH2O shown in the diagram. Subsequent diagrams in the sequence represent
proteologs with lower predicted relative metastabilities, i.e., higher energy requirements for formation relative to
proteologs appearing earlier in the sequence.
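
The sequential procedure described above can be sketched as a "peeling" loop in Python. This is a toy
illustration and not the study's software: each species is given a hypothetical score that is linear in log fO2(g)
and log aH2O (standing in for its logarithm of activity), the species with the greatest score at each grid point
is taken to be predominant, and all species that are predominant anywhere on the grid are removed before the next
diagram is computed (the text allows removing "some or all"; removing all is an assumption made here). The names
A to D and their coefficients are invented for the example.

```python
import numpy as np

def predominant(coeffs, names, log_fo2, log_ah2o):
    """Return the names of species that are predominant somewhere on the grid."""
    X, Y = np.meshgrid(log_fo2, log_ah2o)
    # score_i = a_i + b_i * log fO2 + c_i * log aH2O, evaluated on the grid
    scores = (coeffs[:, 0, None, None]
              + coeffs[:, 1, None, None] * X
              + coeffs[:, 2, None, None] * Y)
    winners = np.argmax(scores, axis=0)  # index of the top species per grid point
    return {names[i] for i in np.unique(winners)}

def peel(coeffs, names, log_fo2, log_ah2o):
    """Successively remove predominant species; one list of names per diagram."""
    remaining = list(range(len(names)))
    layers = []
    while remaining:
        sub_names = [names[i] for i in remaining]
        top = predominant(coeffs[remaining], sub_names, log_fo2, log_ah2o)
        layers.append(sorted(top))
        remaining = [i for i in remaining if names[i] not in top]
    return layers

# Hypothetical species and coefficients (a, b, c); not data from the study.
names = ["A", "B", "C", "D"]
coeffs = np.array([[0.0, 1.0, 0.1],     # favored by high log fO2
                   [0.0, -1.0, 0.0],    # favored by low log fO2
                   [-20.0, 0.0, 1.0],   # never predominant while A or B is present
                   [-40.0, 0.0, -1.0]]) # least favored everywhere
grid_x = np.linspace(-10.0, 10.0, 41)
grid_y = np.linspace(-5.0, 5.0, 11)
print(peel(coeffs, names, grid_x, grid_y))  # [['A', 'B'], ['C'], ['D']]
```

Each inner list plays the role of one panel of Fig. 2: species in later lists only gain predominance fields once
the more metastable species have been excluded from the system.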

There is a large difference between the relatively oxidized conditions of the endoplasmic reticulum reported
in the literature (see Table 2) and the theoretically relatively reduced environment of the ER proteolog shown in
Fig. 2a. Also note the average nominal carbon oxidation state of the ER proteolog, which is the lowest of any in
Table 3. A possible interpretation of these observations is that there is significant chemical heterogeneity within
this compartment and a relatively high energy demand for the formation of these proteins in the oxidizing spaces.
Nevertheless, the juxtaposition in the ER of very reduced proteins and high redox potential does permit a possible
advantage: if the redox potential of the compartment were much lower, the proteins constituting the endoplasmic
reticulum would become more favorable to produce than any other proteins (see below) that are initially produced
there and ultimately localized to other compartments. Perhaps in this way a high redox state could signal the
production of cytoplasmic and secreted proteins and a drop in redox state the production of biosynthetic enzymes,
i.e. the reproduction of the ER itself.

The proteologs appearing in successive diagrams in Fig. 2 are characterized by increasingly higher predicted
energy requirements for their formation. Hence, the nuclear, cytoplasmic and mitochondrial proteologs appearing
in Figs. 2c-d are relatively less metastable compared to those of actin, early Golgi and ER appearing in Fig. 2a. It
is noteworthy that the proteologs representing the two cytoskeletal systems in yeast cells, actin and microtubule,
appear at opposite ends of the energy spectrum. This prediction may be consistent with the observation that actin
in different forms appears to be present at most stages of the cell cycle [87], but that the microtubule cytoskeleton
grows during anaphase (i.e., the stage of the cell cycle characterized by physical separation of the chromosomes;
[88]) and is degraded during other stages of the cell cycle [87] [88].

The order of appearance of phases throughout a reaction sequence is determined by the relative stabilities of
the phases [63]. Examples of the application of this notion in inorganic systems are the reaction series of
metamorphic minerals, paragenetic sequences of mineralization [89], Ostwald ripening [90], and weathering reaction
paths [91]. Can the relative metastabilities of proteins provide information about their order of appearance in the
cell cycle?

The outcome of the mitotic cycle in S. cerevisiae is the growth of a new cell in the form of a bud [88]. Not
all structures in the bud form simultaneously. Instead, it has been observed that [92] "the endoplasmic reticulum,
Golgi, mitochondria, and vacuoles all begin to populate the bud well before anaphase and that their segregation
into the bud does not require microtubules". The results in Fig. 2 indicate that the proteolog for bud is of
comparable metastability relative to that of Golgi but is less metastable than the proteolog of ER. In the absence
of energy input, it follows that there would be a chemical driving force to form the ER proteins at the expense
of any of the bud that may be present. The appearance in the bud of the less-metastable mitochondrial proteins
suggests that there is a source of energy to the bud that is nevertheless not sufficient to drive the formation of
the proteins in the microtubule. The formation of these proteins may not be possible until the products of the
mitochondrial reactions and other energy-rich metabolites have accumulated in the cell.

Figure 2: Relative metastabilities of proteologs of compartments. Predominance diagrams were generated as a
function of log fO2(g) and log aH2O at 25 °C and 1 bar for the proteologs listed in Table 3. The diagram in (a)
represents 23 model proteologs; diagrams in panels (b)-(f) represent successively fewer model proteologs.

Table 4: Major intercompartmental protein interactions in yeast (a).

Interaction                      ΔnO2     log fO2(g)   Interaction                     ΔnO2     log fO2(g)
actin-bud                        0.266    -75.1        vacuole-bud                     0.223    -73.8
actin-bud.neck                   0.078    -83.4        vacuole-cell.periphery          0.140    -73.4
actin-cell.periphery             0.183    -75.4        vacuole-cytoplasm               0.044    -70.7
actin-endosome                   0.129    -76.6        vacuole-endosome                0.086    -74.1
actin-vacuolar.membrane          0.111    -75.1        vacuole-late.Golgi              0.072    -71.5
actin-mitochondrion              0.161    -75.8        nucleus-actin                   -0.023   -88.7
actin-microtubule                0.124    -78.3        nucleus-microtubule             0.101    -75.9
microtubule-bud                  0.142    -72.3        nucleus-spindle.pole            0.139    -75.5
microtubule-bud.neck             -0.045   -69.4        nucleus-bud                     0.243    -73.8
microtubule-cell.periphery       0.059    -69.2        nucleus-bud.neck                0.056    -81.3
microtubule-cytoplasm            -0.037   -83.3        nucleus-cytoplasm               0.064    -71.7
microtubule-spindle.pole         0.038    -74.3        nucleus-nucleolus               -0.034   -78.7
spindle.pole-cytoplasm           -0.075   -78.7        nuclear.periphery-bud.neck      -0.080   -69.3
spindle.pole-nuclear.periphery   -0.004   -119.1       nuclear.periphery-cytoplasm     -0.072   -76.5
ER-cell.periphery                -0.460   -74.7        nuclear.periphery-nucleus       -0.136   -74.2
ER-cytoplasm                     -0.557   -74.7        nuclear.periphery-nucleolus     -0.169   -75.1
ER-early.Golgi                   -0.345   -75.4        peroxisome-cell.periphery       0.062    -78.7
ER-nuclear.periphery             -0.485   -74.4        peroxisome-cytoplasm            -0.034   -67.2
ER-peroxisome                    -0.522   -75.2        peroxisome-lipid.particle       0.138    -74.9
Golgi-endosome                   -0.199   -74.3        peroxisome-mitochondrion        0.040    -82.6
Golgi-vacuole                    -0.285   -74.2        mitochondrion-cell.periphery    0.023    -72.0
Golgi-late.Golgi                 -0.213   -75.2        mitochondrion-cytoplasm         -0.074   -75.4
Golgi-early.Golgi                -0.030   -84.4        mitochondrion-nucleus           -0.138   -73.7

a. Interactions between proteins in different subcellular locations in S. cerevisiae were identified in the literature.
The reaction coefficients on O2(g) (ΔnO2) and the metastable equilibrium values of log fO2(g) were calculated
for each reaction between model proteologs. Names of locations shown in bold indicate that the model value of
log fO2(g) for this compartment (Table 3) lies in the metastability range for the proteolog in the particular reaction.

Intercompartmental protein interactions 

The diagrams in Fig. 2 show the predominant metastability interactions between proteologs for different 
subcellular compartments. However, many subcellular interactions may in fact be meta-metastable with respect to 
Fig. 2. For example, interactions occur between proteins in the cytoplasm and nucleus [93], but the proteologs for 
these compartments do not share a reaction boundary in Fig. 2. Below, known intercompartmental interactions 
are combined with the oxygen fugacity requirements for (meta-)metastable equilibrium of the proteologs to 
characterize compartmental oxidation-reduction potentials. These are used in the next section to explore a possible 
developmental reaction path. 

To assess the biochemical evidence for specific interactions between proteins in different compartments in yeast 
cells, a series of review papers was surveyed [87] [23 ISHJ [93] [96] [97]. Statements implying interaction between 
proteins in different compartments were identified by scanning for action words including interact, are at, align, end 
at, organize, embed, move, associate, found, locate, extend, bisect, migrate, enter, attach, translocate, carry, 
sort, composed of, line, dock and fuse, recycle, transport, pinch, proceed, reach, degrade in, deliver, colocalize, 
contain, separate, protrude, penetrate, cooperate, crosstalk, anchor, reside, continuous with, shuttle, 
oxidize, essential to, convey, arrange, import, and transcribe. The source statements are listed in Text S3 and 
simplified pairwise representations of the interactions are summarized in Table 4. Of 190 possible combinations 
between any two of the 20 subcellular compartments (this count excludes the ambiguous location and ER to Golgi 
and punctate composite, which did not appear in the literature survey), 46 interactions were identified through 
this survey. 

Figure 3: Logarithms of oxygen fugacity for equal chemical activities of proteologs in intercompartmental 
interactions. Metastable equilibrium values of $\log f_{\mathrm{O_2(g)}}$ were obtained for the model reactions listed in Table 4. 
Reactions are grouped by a common proteolog, listed along the bottom of the plot. Reactions that were used to 
derive model values of oxygen fugacity of compartments listed in Table 3 are denoted by arrows and bold lines 
and labels. The position of the reaction labels denotes the direction of the reaction that favors formation of the 
corresponding proteolog. The actin-bud and ER-cell periphery interactions were omitted from this plot to aid in 
clarity of labeling; they overlap with actin-vacuolar membrane and ER-cytoplasm, respectively. 

Chemical reactions corresponding to each of the interactions listed in Table 4 were written between residue 
equivalents of the proteologs, with the reactant proteolog being the one on the left-hand side of the interaction and 
the product proteolog the one on the right-hand side. The reactions are listed in Table S2. Corresponding values 
of $\Delta n_{\mathrm{O_2(g)}}$ (reaction coefficient on O2(g)) are listed in Table 4 together with the values of $\log f_{\mathrm{O_2(g)}}$ where the 
calculated chemical activities of the two proteologs in each reaction are equal. Note that there are some reactions 
where the absolute value of $\Delta n_{\mathrm{O_2(g)}}$ is substantially smaller than the others; these include spindle pole-nuclear 
periphery, Golgi-early Golgi and nucleus-actin. Because of the small value of $\Delta n_{\mathrm{O_2(g)}}$ in these reactions, the values 
of $\log f_{\mathrm{O_2(g)}}$ for equal activities of these proteins tend to be more extreme than for other reactions. Note that the 
sign of $\Delta n_{\mathrm{O_2(g)}}$ denotes the thermodynamically favored direction of the reaction as $\log f_{\mathrm{O_2(g)}}$ is changed from its 
equal-activity value; for example, at $\log f_{\mathrm{O_2(g)}} = -75.1$, the proteologs of actin and bud metastably coexist with 
equal chemical activities, but at higher values that of actin predominates in metastable equilibrium. 

The interactions listed in Table 4 were used to generate model values of the oxygen fugacity in each 
compartment that are listed in Table 3. The criterion used for this analysis was that the oxygen fugacity in a compartment 
should in as many cases as possible favor the formation of its proteolog relative to those of interacting 
compartments. For example, consider the proteolog for endosome, which occurs in three interactions listed in Table 4. 
The endosomal proteolog is favored to form relative to that of actin by $\log f_{\mathrm{O_2(g)}} < -76.6$ and relative to that of 
vacuole by $\log f_{\mathrm{O_2(g)}} < -74.1$. In contrast, the endosomal proteolog is favored to form relative to the proteolog 
of Golgi by $\log f_{\mathrm{O_2(g)}} > -74.3$. A single value of $\log f_{\mathrm{O_2(g)}}$ can satisfy at most two of these constraints; the 
model value for endosome is taken to be just below the limit for its interaction with actin, or $\log f_{\mathrm{O_2(g)}} = -76.7$ 
(Table 3). Because this value favors formation of the endosomal proteolog relative to those of actin and vacuole, 
the proteolog of endosome is listed in bold font in these interactions in Table 4 but is shown in normal font in 
the interaction with the Golgi proteolog. Similar reasoning was used to derive oxygen fugacities for the other 
subcellular compartments listed in Table 3 except for microtubule. 
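
The counting argument for the endosome can be sketched in a few lines of Python. This is only an illustration of the criterion stated above, not the procedure actually used to build Table 3; the function names and the 0.1 log-unit offset used to generate candidate values are assumptions made for the example.

```python
# Constraints on log fO2(g) that favor the endosomal proteolog, taken from the text:
# (interacting partner, direction, limit)
ENDOSOME_CONSTRAINTS = [
    ("actin", "<", -76.6),
    ("vacuole", "<", -74.1),
    ("Golgi", ">", -74.3),
]


def satisfied(logf, constraints):
    """Return the partners relative to which the proteolog is favored at logf."""
    favored = []
    for partner, op, limit in constraints:
        if (op == "<" and logf < limit) or (op == ">" and logf > limit):
            favored.append(partner)
    return favored


def best_candidates(constraints, offset=0.1):
    """Score candidate values just below and just above each limit (hypothetical offset)."""
    candidates = sorted(
        {round(limit + sign * offset, 1) for _, _, limit in constraints for sign in (-1, 1)}
    )
    scored = [(c, satisfied(c, constraints)) for c in candidates]
    best = max(len(partners) for _, partners in scored)
    return [(c, partners) for c, partners in scored if len(partners) == best]


for value, partners in best_candidates(ENDOSOME_CONSTRAINTS):
    print(value, partners)
```

Two candidate values tie with two satisfied constraints each; the value of -76.7 adopted in the text is one of them, so the final choice between ties remains a modeling decision.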

The outcome of the above analysis is summarized in Fig. 3, where the values of $\log f_{\mathrm{O_2(g)}}$ for interactions that 
fall between -79 and -71 are plotted. The interactions are grouped by a common interacting proteolog so that 
differences between them can be more easily visualized. To avoid clutter, the reaction labels are generally restricted 
to the name of a single proteolog to indicate the direction of $\log f_{\mathrm{O_2(g)}}$ change that favors its formation in the 
reaction. Model interactions that were used to constrain the limits of oxygen fugacities for one compartment (such 
as the actin-endosome interaction noted above) or two compartments (such as Golgi-late Golgi) are identified 
with one or two arrows, respectively, and the names of the corresponding proteologs are shown in bold font. 

If the model compartmental values of $\log f_{\mathrm{O_2(g)}}$ all favored formation of the corresponding proteologs relative to 
their interacting partners, the name of every proteolog would appear in bold font in Table 4. This is only the case, 
however, for some proteologs such as that of actin, where $\log f_{\mathrm{O_2(g)}} = -75$ favors formation of this proteolog relative 
to any of its interacting partners. At the same oxygen fugacity, it can be shown that the proteolog for microtubule 
is unmetastable with respect to any of its interacting partners except for bud neck. Notably, the proteolog for 
microtubule only becomes relatively metastable at high oxygen fugacities (with respect to bud, cell periphery and spindle 
pole) or at low oxygen fugacities (with respect to actin, cytoplasm and nucleus). Hence, the value of $\log f_{\mathrm{O_2(g)}} = -75$ taken 
here for the microtubule compartment is different from all the others, in that this represents conditions where the 
formation of its proteolog is more unfavorable than that of any of its interacting partners. 



Sequential formation driven by oxygen fugacity gradients 

We have already seen theoretical evidence that the microtubule is a relatively unmetastable assemblage of proteins 
in the cell. It is known in spite of this that the microtubule as well as the spindle pole are essential in cellular 
division [87]. Can the metastable equilibrium relationships reveal anything about the origins of the interactions 
of the microtubule and spindle pole in this process? The following thought experiment explores why the 
irreversible formation of proteologs might follow a sequence that is related to metastable equilibrium thermodynamic 
relationships. 

To start, consider a permeable sac consisting of the cytoplasmic proteolog, which we will expose to a 
changing oxidation-reduction environment. The oxidation-reduction program will begin at $\log f_{\mathrm{O_2(g)}} = -75$, drop to 
$\log f_{\mathrm{O_2(g)}} = -83.5$, increase to $\log f_{\mathrm{O_2(g)}} = -69$ and return to $\log f_{\mathrm{O_2(g)}} = -75$. At any point along this program 
the only reactions we will consider are those involving the proteologs of microtubule or spindle pole. Let us assume 
in addition that none of these reactions proceeds to completion, and that any reaction may only proceed while 
$\log f_{\mathrm{O_2(g)}}$ is near the equal-activity value for the reaction. Keeping in mind that no mechanism for the reactions 
is implied here, it may still be worthwhile to note that others have observed near-equilibrium concentrations of 
substrates in a subset of enzymatically catalyzed reactions [98] [99]. 

At $\log f_{\mathrm{O_2(g)}} = -75$, no reaction occurs because the conditions coincide with the metastability field of the 
cytoplasmic proteolog relative to either microtubule or spindle pole. As soon as $\log f_{\mathrm{O_2(g)}}$ decreases below 
-78.7, some of the spindle pole proteolog may form irreversibly at the expense of the cytoplasmic proteolog. Below 
$\log f_{\mathrm{O_2(g)}} = -83.3$, the microtubular proteolog can begin to form at the expense of the cytoplasmic proteolog. 
At $\log f_{\mathrm{O_2(g)}} = -83.5$ both of these reactions may favorably proceed, and we begin now to increase $\log f_{\mathrm{O_2(g)}}$. 
As we pass $\log f_{\mathrm{O_2(g)}} = -83.3$, then $\log f_{\mathrm{O_2(g)}} = -78.7$ going in the positive direction, some of the proteolog of 
microtubule, then spindle pole can react irreversibly to form the cytoplasmic proteolog. These are the opposite of 
the first two irreversible reactions. 

As long as the current and following reactions do not proceed to completion, there will be a population of 
the microtubule and spindle pole proteologs available to react. Above $\log f_{\mathrm{O_2(g)}} = -78.7$, where the formation of 
the cytoplasmic proteolog becomes favored relative to spindle pole (see above), the proteolog of actin may also 
favorably form at the expense of that of microtubule. The nuclear proteolog can form above $\log f_{\mathrm{O_2(g)}} = -75.9$ 
at the expense of the microtubular proteolog, and above $\log f_{\mathrm{O_2(g)}} = -75.5$ at the expense of the spindle pole 
proteolog. We now momentarily pass through our starting point, $\log f_{\mathrm{O_2(g)}} = -75$. So far, the proteologs 
of spindle pole, microtubule, actin and nucleus, in that order, may have formed as a result of irreversible 
reactions of the original cytoplasmic proteolog. Also, the proteologs of microtubule and spindle pole may have 
been subsequently partially degraded after their possible formation. 

Table 5: Hypothetical oxygen fugacity cycle and sequence of reactions of proteologs. 

$\log f_{\mathrm{O_2(g)}}$    Reaction
-75.0     Begin
-78.7     cytoplasm → spindle.pole
-83.3     cytoplasm → microtubule
-83.5     Minimum point
-83.3     microtubule → cytoplasm
-78.7     spindle.pole → cytoplasm
-78.7     microtubule → actin
-75.9     microtubule → nucleus
-75.5     spindle.pole → nucleus
-74.3     spindle.pole → microtubule
-69.4     microtubule → bud.neck
-69.0     Maximum point
-69.2     microtubule → cell.periphery
-69.4     bud.neck → microtubule
-72.3     microtubule → bud
-74.3     microtubule → spindle.pole
-75.0     End

Now, as $\log f_{\mathrm{O_2(g)}}$ is increased above -74.3, the proteolog of spindle pole becomes unmetastable relative to that 
of microtubule. Above $\log f_{\mathrm{O_2(g)}} = -69.4$, the proteolog of bud neck may be formed irreversibly at the expense 
of that of microtubule. At our maximum $\log f_{\mathrm{O_2(g)}} = -69$ this reaction can continue, but as we drop below 
$\log f_{\mathrm{O_2(g)}} = -69.2$ it may be joined by formation of the proteolog of cell periphery. Below $\log f_{\mathrm{O_2(g)}} = -69.4$ 
any proteolog of bud neck that may have formed becomes unmetastable relative to that of microtubule. Below 
$\log f_{\mathrm{O_2(g)}} = -72.3$ any proteolog of microtubule that remains may degrade in favor of formation of the proteolog of 
bud. Finally, as we drop past $\log f_{\mathrm{O_2(g)}} = -74.3$ and return to our starting point of $\log f_{\mathrm{O_2(g)}} = -75$ the proteolog 
of spindle pole once again becomes relatively metastable instead of microtubule. In summary, at $\log f_{\mathrm{O_2(g)}} > -75$ 
the potential arises for formation of proteologs of the microtubule, bud neck, cell periphery, bud and spindle pole, 
as well as for retrograde reactions that may destroy the proteolog of microtubule. 
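
The ordering of events along the hypothetical program can be reproduced with a short Python sketch. It assumes that each reaction is characterized only by its equal-activity value and by the direction of change of log fO2(g) in which it becomes favorable (both read from Table 5); availability of reactants and reaction progress are not modeled. The names `EVENTS`, `PROGRAM` and `reaction_sequence` are illustrative.

```python
# (equal-activity log fO2(g), direction in which the reaction becomes favorable, reaction)
EVENTS = [
    (-78.7, "down", "cytoplasm -> spindle.pole"),
    (-83.3, "down", "cytoplasm -> microtubule"),
    (-83.3, "up", "microtubule -> cytoplasm"),
    (-78.7, "up", "spindle.pole -> cytoplasm"),
    (-78.7, "up", "microtubule -> actin"),
    (-75.9, "up", "microtubule -> nucleus"),
    (-75.5, "up", "spindle.pole -> nucleus"),
    (-74.3, "up", "spindle.pole -> microtubule"),
    (-69.4, "up", "microtubule -> bud.neck"),
    (-69.2, "down", "microtubule -> cell.periphery"),
    (-69.4, "down", "bud.neck -> microtubule"),
    (-72.3, "down", "microtubule -> bud"),
    (-74.3, "down", "microtubule -> spindle.pole"),
]

# Turning points of the oxidation-reduction program described in the text
PROGRAM = [-75.0, -83.5, -69.0, -75.0]


def reaction_sequence(program, events):
    """List the reactions in the order their thresholds are crossed along the program."""
    sequence = []
    for start, end in zip(program, program[1:]):
        direction = "up" if end > start else "down"
        low, high = min(start, end), max(start, end)
        # Keep only thresholds crossed on this leg in the favorable direction
        leg = [e for e in events if e[1] == direction and low < e[0] < high]
        # Sort in the direction of travel; the sort is stable, so ties keep input order
        leg.sort(key=lambda e: e[0], reverse=(direction == "down"))
        sequence.extend(leg)
    return sequence


for threshold, direction, reaction in reaction_sequence(PROGRAM, EVENTS):
    print(f"{threshold:6.1f}  {reaction}")
```

Changing the turning points in `PROGRAM` shows how a narrower cycle would skip the reactions whose equal-activity values lie outside the range traversed.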

It is important to emphasize the qualified nature of these predictions; all we know from thermodynamics is 
that any of these reactions could have progressed in the direction of a local Gibbs energy minimum. Whether 
and to what extent they actually move forward is a consequence of the reaction mechanism. The purpose of 
this analysis is not to suggest any mechanism but to ask whether work performed by control of $\log f_{\mathrm{O_2(g)}}$ may 
energize such a mechanism. The enzymatic properties of the proteins themselves are probably essential in any 
actual mechanism. It is encouraging to observe that at and below the starting $\log f_{\mathrm{O_2(g)}} = -75$ the proteolog 
of endoplasmic reticulum is favored to form relative to the cytoplasmic proteolog. Hence under these conditions 
there exists a potential for production of biosynthetic enzymes. 

The results of this thought experiment are summarized in Table 5. The range of theoretical values of $\log f_{\mathrm{O_2(g)}}$ 
required for the chemical transformations among the proteologs is between -83.5 and -69, which in terms of 
redox potential at 25 °C, 1 bar, pH = 7 and $\log a_{\mathrm{H_2O}} = 0$ correspond to Eh = -0.420 V and Eh = -0.205 V, 
respectively (Eqn. 11). The former value is just below the stability limit for water ($\log f_{\mathrm{O_2(g)}} = -83.1$), but the 
redox state of the NADPH/NADP+ pool in rat liver mitochondria might approach this value (Eh = -0.415 V 
[86]). The latter value is consistent with the state of human cells during differentiation (Eh = -0.200 V), which 
is about 0.040 V higher than that of proliferating cells [100]. 
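
The conversion between oxygen fugacity and Eh quoted above can be checked with a short Python sketch. Eqn. 11 itself is not reproduced in this section, so the sketch assumes the standard half-reaction H2O = 1/2 O2(g) + 2 H+ + 2 e-, together with the standard Gibbs energy of H2O (-56.688 kcal/mol) that appears in Eqn. (4); the function names are illustrative.

```python
import math

R_KCAL = 1.987204e-3   # gas constant, kcal mol^-1 K^-1
F_KCAL = 23.06055      # Faraday constant, kcal V^-1 mol^-1
DG_H2O = -56.688       # standard Gibbs energy of formation of H2O, kcal/mol (as in Eqn. 4)


def log_K_water(T=298.15):
    """log K of H2O = 1/2 O2(g) + 2 H+ + 2 e- (Gibbs energies of O2, H+ and e- are zero)."""
    return DG_H2O / (math.log(10) * R_KCAL * T)


def eh_from_logfO2(logf, pH=7.0, log_aH2O=0.0, T=298.15):
    """Eh in volts for a given log fO2(g), assuming the half-reaction above."""
    # log K = 0.5 log fO2 - 2 pH - 2 pe - log aH2O, solved for pe
    pe = (-log_K_water(T) + 0.5 * logf - 2.0 * pH - log_aH2O) / 2.0
    return pe * math.log(10) * R_KCAL * T / F_KCAL


def water_reduction_limit(T=298.15):
    """log fO2(g) at which H2O = H2(g) + 1/2 O2(g) is in equilibrium with 1 bar H2."""
    return 2.0 * log_K_water(T)


print(round(eh_from_logfO2(-83.5), 3), round(eh_from_logfO2(-69.0), 3))
print(round(water_reduction_limit(), 1))
```

Under these assumptions the sketch agrees with the quoted Eh values to within about a millivolt.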

Oscillations in the redox state of yeast cells are coupled to many metabolic changes including protein 
transcription and turnover [101]. Reductive and oxidative phases in the metabolic cycle of yeast have been identified, 
with DNA replication occurring during the former and cell cycle initiation occurring at an advanced stage of the 
latter [102]. Oxidative stress was shown to hasten HeLa cells into anaphase by overcoming the normal spindle 
checkpoint mechanism [103]. Although the results shown in Table 5 do not directly address the synthesis of DNA, 
they do show that there is a potential for the formation of the nuclear proteolog during a relatively reducing part 
of the hypothetical $f_{\mathrm{O_2(g)}}$ cycle. In the oxidizing part of this cycle, above $\log f_{\mathrm{O_2(g)}} = -74.3$, the metastability of 
the proteolog for spindle pole is decreased, and at the highest oxidation-reduction potentials a favorable chemical 
potential field exists for metastable formation of the proteolog for bud neck. Hence, the notion that "a 
fundamental redox attractor underpins ... core cellular processes" [104] is in principle supported by the changing relative 
metastabilities of the proteologs as a function of oxidation-reduction potential. 

Figure 4: Metastable equilibrium abundances of model proteologs and proteins as a function of oxygen 
fugacity. Chemical speciation diagrams were generated as a function of $\log f_{\mathrm{O_2(g)}}$ at 25 °C and 1 bar and with 
total activity of protein residues equal to unity for (a) the proteologs shown in Table 1 and (b) the five proteins 
localized to ER to Golgi whose experimental abundances were reported in [105]. The rightmost dotted line in 
(b) indicates conditions where the calculated abundance ranking of the proteins is identical to that found in the 
experiments, and the leftmost dotted line where the calculated logarithms of activities have a lower overall deviation 
from experimental ones, which are indicated by the points. This value of $\log f_{\mathrm{O_2(g)}}$ (-78) was used to construct 
the corresponding diagram in Fig. 5. 



Calculation of relative abundances of proteins 

Above, the interactions between homologs (enzyme isoforms) in subcellular compartments and proteologs repre- 
senting overall protein compositions in subcellular compartments were used to derive oxygen fugacity limits for 
metastable reaction of proteins in different compartments. In the second part of this study, attention is focused 
on the relative abundances and intracompartmental interactions of proteins. 

The logarithms of activities of proteologs consistent with metastable equilibrium among all 23 model proteologs 
are plotted in Fig. 4a as a function of $\log f_{\mathrm{O_2(g)}}$. This diagram was generated based on metastable equilibrium 
among the residues of the proteins [70] in the same manner as described in detail below for a smaller set of proteins 
(those appearing in Fig. 4b). The purpose of Fig. 4a is to recapitulate the relationships shown in Fig. 2. Note that 
the same proteins predominate at the extremes of oxygen fugacity represented in Fig. 4a and in Fig. 2 (reducing: ER; 
oxidizing: actin) and that the proteolog of microtubule appears with low relative abundance. More importantly, 
perhaps, there is a minimum in the range of calculated activities of the proteologs around $\log f_{\mathrm{O_2(g)}} = -75$; 
changing oxidation-reduction potential alters not only the identity of the predominant protein in a metastably 
interacting population but also the relative abundances of all the others. There is probably not a single value of 
$\log f_{\mathrm{O_2(g)}}$ where the calculated relative abundances of the proteologs shown in Fig. 2 reflect the composition of 
the cell. Let us therefore look more closely at the relative abundances of proteins within compartments. 

In Fig. 4b the relative abundances of the five model proteins localized exclusively to ER to Golgi are shown 
as a function of $\log f_{\mathrm{O_2(g)}}$. A worked-out example of the calculations leading to this figure, a method that also 
underlies the generation of the other figures shown here, is presented in the following paragraphs. 

The model proteins for ER to Golgi, in order of decreasing abundance in the cell reported by [105], are 
YLR208W, YHR098C, YDL195W, YNL049C and YPL085W. (For simplicity, the proteins are identified here by 
the names of the open reading frames (ORF).) The formula of the uncharged form of the first protein, YLR208W, 
is $\mathrm{C_{1485}H_{2274}N_{400}O_{449}S_{4}}$, and its amino acid sequence length is 297 residues. The standard molal Gibbs energy 
of formation from the elements ($\Delta G^\circ_f$) of this protein at 25 °C and 1 bar calculated using group additivity 
[69] is -10670 kcal mol$^{-1}$. At this temperature and pressure and at pH = 7, group additivity can also be 
used [69] to calculate the charge of the protein (-10.8832) and the standard molal Gibbs energy of formation 
from the elements of the charged protein (-10880 kcal mol$^{-1}$). The formula of the protein in this ionization 
state is $\mathrm{C_{1485}H_{2263.1168}N_{400}O_{449}S_{4}^{-10.8832}}$. Dividing by the length of the protein, we find that the formula 
and standard molal Gibbs energy of formation from the elements of the residue equivalent of YLR208W are 
$\mathrm{C_{5.0000}H_{7.6199}N_{1.3468}O_{1.5118}S_{0.0135}^{-0.0366}}$ and -36.633 kcal mol$^{-1}$, respectively. 

The formation from basis species of the residue equivalent of YLR208W is consistent with 

$$5.0000\,\mathrm{CO_{2(aq)}} + 1.7946\,\mathrm{H_2O} + 1.3468\,\mathrm{NH_{3(aq)}} + 0.0135\,\mathrm{H_2S_{(aq)}} \rightleftharpoons \mathrm{C_{5.0000}H_{7.6199}N_{1.3468}O_{1.5118}S_{0.0135}^{-0.0366}} + 5.1414\,\mathrm{O_{2(g)}} + 0.0366\,\mathrm{H^+} \,. \qquad (1)$$

Similar reasoning can be applied to write the formation reaction of the residue equivalent of YHR098C as 

$$4.9720\,\mathrm{CO_{2(aq)}} + 1.8708\,\mathrm{H_2O} + 1.3240\,\mathrm{NH_{3(aq)}} + 0.0441\,\mathrm{H_2S_{(aq)}} \rightleftharpoons \mathrm{C_{4.9720}H_{7.7882}N_{1.3240}O_{1.5231}S_{0.0441}^{-0.0138}} + 5.1464\,\mathrm{O_{2(g)}} + 0.0138\,\mathrm{H^+} \,. \qquad (2)$$

The double arrows signify that a priori one does not know the sign of the chemical affinity of either of these 
reactions. 
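
The arithmetic behind the residue equivalent and the coefficients of Reaction 1 can be retraced with a minimal Python sketch. It assumes that the net charge arises only from loss of protons (consistent with the ionized formula given above) and that the coefficients follow from element and charge balance with the basis species CO2, H2O, NH3, H2S, O2 and H+; the function names are illustrative and are not taken from any software used in the study.

```python
def residue_equivalent(formula, charge, length, dG_charged):
    """Per-residue formula, charge and Gibbs energy (kcal/mol) of an ionized protein."""
    ionized = dict(formula)
    # Assumption: each unit of negative charge corresponds to one proton lost
    ionized["H"] = formula["H"] + charge
    per_residue = {element: n / length for element, n in ionized.items()}
    return per_residue, charge / length, dG_charged / length


def formation_coefficients(res, z):
    """Coefficients in: CO2 + H2O + NH3 + H2S = residue + O2 + H+ (all returned positive)."""
    n_CO2, n_NH3, n_H2S = res["C"], res["N"], res["S"]   # C, N and S balance
    n_Hplus = -z                                          # charge balance
    n_H2O = (res["H"] + n_Hplus - 3 * n_NH3 - 2 * n_H2S) / 2   # H balance
    n_O2 = (2 * n_CO2 + n_H2O - res["O"]) / 2                  # O balance
    return {"CO2": n_CO2, "H2O": n_H2O, "NH3": n_NH3, "H2S": n_H2S,
            "O2": n_O2, "H+": n_Hplus}


# Values for YLR208W quoted in the text
ylr208w = {"C": 1485, "H": 2274, "N": 400, "O": 449, "S": 4}
res, z, dG_res = residue_equivalent(ylr208w, charge=-10.8832, length=297, dG_charged=-10880.0)
coeffs = formation_coefficients(res, z)

print({k: round(v, 4) for k, v in res.items()}, round(z, 4), round(dG_res, 3))
print({k: round(v, 4) for k, v in coeffs.items()})
```

Applying `formation_coefficients` to the rounded residue formula of YHR098C gives coefficients that agree with Reaction 2 to within rounding of the fourth decimal place.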

At 929 residues, YHR098C is over 3 times as long as YLR208W, but in the formation reactions from the basis 
species of the residue equivalents of the two proteins, the coefficients on the basis species are similar. The difference 
between the coefficients of the same basis species in the reactions signifies the response (owing to moderation, i.e. 
Le Chatelier's principle [106]) of the metastable equilibrium assemblage to changes in the corresponding chemical 
activity or fugacity. For example, because $\nu_{\mathrm{CO_2},(1)} < \nu_{\mathrm{CO_2},(2)}$, $\nu_{\mathrm{NH_3},(1)} < \nu_{\mathrm{NH_3},(2)}$ and $\nu_{\mathrm{O_2},(1)} < \nu_{\mathrm{O_2},(2)}$, increasing 
$a_{\mathrm{CO_2(aq)}}$, $a_{\mathrm{NH_3(aq)}}$ or $f_{\mathrm{O_2(g)}}$ at constant T, P and chemical activities of the other basis species shifts the metastable 
equilibrium in favor of YLR208W at the expense of YHR098C. Here, $\nu_i$ denotes the reaction coefficient of the 
ith basis species or protein, which is negative for reactants and positive for products as written, and the subscripts 
(1) and (2) refer to Reactions 1 and 2. Conversely, 
because $\nu_{\mathrm{H_2O},(1)} > \nu_{\mathrm{H_2O},(2)}$, $\nu_{\mathrm{H_2S},(1)} > \nu_{\mathrm{H_2S},(2)}$ and $\nu_{\mathrm{H^+},(1)} > \nu_{\mathrm{H^+},(2)}$, increasing $a_{\mathrm{H_2O}}$, $a_{\mathrm{H_2S(aq)}}$ or $a_{\mathrm{H^+}}$ (decreasing pH) 
at constant T, P and chemical activities of the other basis species shifts the metastable equilibrium in favor of 
YHR098C at the expense of YLR208W. The magnitude of the effect is proportional to the size of the difference 
between the coefficients of the basis species in the reactions, and it can be quantified for a specific model system 
using the following calculations. 

To assess the relative abundances of the proteins in metastable equilibrium, we proceed by calculating the 
chemical affinities of each of the formation reactions. The chemical affinity (A) is calculated by combining the 
equilibrium constant (K) with the reaction activity product (Q) according to [107] 

$$A/2.303RT = \log (K/Q) = \log \left( \frac{10^{-\Delta G^\circ/2.303RT}}{\prod_i a_i^{\nu_i}} \right) , \qquad (3)$$

where 2.303 is the natural logarithm of 10, R stands for the gas constant, T is temperature in kelvins, 
$\Delta G^\circ$ is the standard molal Gibbs energy of the reaction, and $a_i$ and $\nu_i$ represent the chemical activity and reaction 
coefficient of the ith basis species or species of interest (i.e., residue equivalent of the protein) in the reaction. Let 
us calculate $\Delta G^\circ$ (in kcal mol$^{-1}$) of Reaction 1 by writing 

$$\begin{aligned}
\Delta G^\circ_{(1)} = {} & 1 \times (-36.633) + 5.1414 \times 0 + 0.0366 \times 0 \\
& - 5.0000 \times (-92.250) - 1.7946 \times (-56.688) \\
& - 1.3468 \times (-6.383) - 0.0135 \times (-6.673) \\
= {} & 535.036 \,. 
\end{aligned} \qquad (4)$$

In Eqn. (4) the values of $\Delta G^\circ$ of O2(g) and H+ are both zero, which is consistent with the standard state 
conventions for gases and the hydrogen ion convention used in solution chemistry. The values of $\Delta G^\circ$ of the 
other basis species are taken from the literature [108] [109] [110]. The value of $\log K_{(1)}$ consistent with Eqn. (4) is 
-392.19. 

We now calculate the activity product of the reaction using 

$$\begin{aligned}
\log Q_{(1)} = {} & 1 \times 0 + 5.1414 \times (-75.3) + 0.0366 \times (-7) \\
& - 5.0000 \times (-3) - 1.7946 \times 0 - 1.3468 \times (-4) - 0.0135 \times (-7) \\
= {} & -366.92 \,. 
\end{aligned} \qquad (5)$$



The values of $a$ used to write Eqn. (5) are the reference values listed in the Methods for $a_{\mathrm{CO_2(aq)}}$, $a_{\mathrm{H_2O}}$, $a_{\mathrm{NH_3(aq)}}$, 
$a_{\mathrm{H_2S(aq)}}$ and $a_{\mathrm{H^+}}$. The value of $f_{\mathrm{O_2(g)}}$ used in Eqn. (5) ($\log f_{\mathrm{O_2(g)}} = -75.3$) is also a reference value that, it will 
be shown, characterizes a metastable equilibrium distribution of proteins that is rank-identical to the measured 
relative abundances of the proteins. Finally, the value of $a$ of the residue equivalent of the protein in Eqn. (5) is 
set to a reference value of unity ($\log a = 0$). If we are only concerned with the relative abundances of the proteins 
in metastable equilibrium, the actual value used here does not matter so long as it is the same in the analogous 
calculations for the other proteins. 
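
Eqns. (3)-(5) can be evaluated together in a few lines of Python. The sketch below uses only the coefficients, standard Gibbs energies and reference activities quoted above; the value of the gas constant and the dictionary layout are assumptions of the example, and small differences from the quoted values are expected from rounding of the inputs.

```python
import math

R_KCAL = 1.987204e-3                       # gas constant, kcal mol^-1 K^-1 (assumed value)
T = 298.15                                 # 25 degrees C in kelvins
LN10_RT = math.log(10) * R_KCAL * T

# Reaction 1: species -> (reaction coefficient, standard Gibbs energy in kcal/mol, log activity)
# Coefficients are negative for reactants and positive for products.
REACTION_1 = {
    "residue": (1.0, -36.633, 0.0),
    "O2(g)": (5.1414, 0.0, -75.3),
    "H+": (0.0366, 0.0, -7.0),
    "CO2(aq)": (-5.0000, -92.250, -3.0),
    "H2O": (-1.7946, -56.688, 0.0),
    "NH3(aq)": (-1.3468, -6.383, -4.0),
    "H2S(aq)": (-0.0135, -6.673, -7.0),
}


def affinity(reaction):
    """Return (dG, log K, log Q, A/2.303RT) for a formation reaction."""
    dG = sum(nu * g for nu, g, _ in reaction.values())           # Eqn. (4)
    log_K = -dG / LN10_RT
    log_Q = sum(nu * log_a for nu, _, log_a in reaction.values())  # Eqn. (5)
    return dG, log_K, log_Q, log_K - log_Q                         # Eqn. (3)


dG, log_K, log_Q, A = affinity(REACTION_1)
print(round(dG, 3), round(log_K, 2), round(log_Q, 2), round(A, 2))
```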

Combining Eqns. (3)-(5) yields A/2.303RT = -25.25 (this is a non-dimensional number). Following the 
same procedure for the other four proteins (YHR098C, YDL195W, YNL049C and YPL085W) results in A/2.303RT 
equal to -24.86, -24.74, -24.93 and -24.94, respectively. Now let us turn to the relative abundances of the 
proteins in metastable equilibrium, which we compute using a Boltzmann distribution for the relative abundances 
of the residue equivalents: 

$$\frac{a_i}{a_t} = \frac{e^{A_i/RT}}{\sum_{i=1}^{n} e^{A_i/RT}} \,, \qquad (6)$$

where $a_t$ denotes the total activity of residue equivalents in the system and n stands for the number of proteins 
in the system. Note regarding the left-hand side of Eqn. (6) that because we are taking activity coefficients of 
unity, the ratio $a_i/a_t$ is equal to the ratio of concentrations, or proportionally numbers, of residue equivalents 
in the system. There is not a negative sign in front of A/RT in the exponents of Eqn. (6) because the chemical 
affinity is the negative of the Gibbs energy change of the reaction. Note in addition that the values of A/2.303RT 
given above must be multiplied by ln 10 = 2.303 before being substituted in Eqn. (6). By taking $a_t = 1$, we can 
combine Eqn. (6) with A/RT of each of the formation reactions to calculate chemical activities of the residue 
equivalents of the proteins equal to 0.0905, 0.2248, 0.2994, 0.1944 and 0.1909, respectively. The lengths of 
the proteins are 297, 929, 1273, 876 and 2195, so the corresponding logarithms of activities of the proteins are 
e.g. log (0.0905/297) = -3.52 for YLR208W, and -3.61, -3.63, -3.65 and -4.06 for the remaining proteins, 
respectively. 
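
A minimal Python sketch of Eqn. (6) follows, using the two-decimal values of A/2.303RT quoted above as inputs. Because those inputs are rounded, the resulting activities differ from the quoted ones in the third decimal place; the function and variable names are illustrative.

```python
import math

# A/2.303RT of the residue formation reactions at log fO2(g) = -75.3 (values from the text)
A_OVER_2303RT = {
    "YLR208W": -25.25,
    "YHR098C": -24.86,
    "YDL195W": -24.74,
    "YNL049C": -24.93,
    "YPL085W": -24.94,
}
LENGTHS = {"YLR208W": 297, "YHR098C": 929, "YDL195W": 1273, "YNL049C": 876, "YPL085W": 2195}


def boltzmann_activities(affinities, a_total=1.0):
    """Activities of residue equivalents from Eqn. (6); inputs are A/2.303RT."""
    a_max = max(affinities.values())
    # exp(A/RT) = 10**(A/2.303RT); subtracting the maximum avoids numerical underflow
    weights = {k: 10.0 ** (v - a_max) for k, v in affinities.items()}
    total = sum(weights.values())
    return {k: a_total * w / total for k, w in weights.items()}


residue_activity = boltzmann_activities(A_OVER_2303RT)
log_protein_activity = {k: math.log10(residue_activity[k] / LENGTHS[k]) for k in LENGTHS}

for name in A_OVER_2303RT:
    print(name, round(residue_activity[name], 4), round(log_protein_activity[name], 2))
```

Dividing each residue activity by the protein length converts it to the activity of the whole protein, which is why the longest protein (YPL085W) ends up with the lowest logarithm of activity even though its residue activity is not the smallest.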

If one now iterates calculation of the chemical affinities of the residue formation reactions using the calculated 
metastable equilibrium logarithms of activities of the residue equivalents (instead of the starting reference value 
of $\log a = 0$), the resulting chemical affinities for each formation reaction will be all equal and generally non-zero. 
This property of metastable equilibrium was used in [70] to describe specific application of a method using a 
system of linear equations for finding the metastable equilibrium state without explicitly writing Eqn. (6). 
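
The equal-affinity property can be verified numerically. In the sketch below, which is an illustration rather than the linear-equation method of [70], the affinities are recomputed after replacing the reference value log a = 0 with the Boltzmann activities; since the residue equivalent enters the activity product with a coefficient of one, the updated A/2.303RT is the starting value minus log a.

```python
import math

# Starting values of A/2.303RT computed with log a = 0 (from the text)
A0 = {"YLR208W": -25.25, "YHR098C": -24.86, "YDL195W": -24.74,
      "YNL049C": -24.93, "YPL085W": -24.94}

# Metastable equilibrium activities of the residue equivalents (Eqn. 6 with a_t = 1)
total = sum(10.0 ** v for v in A0.values())
log_a = {k: v - math.log10(total) for k, v in A0.items()}

# Recompute the affinities: log Q increases by log a of the residue equivalent
A1 = {k: A0[k] - log_a[k] for k in A0}

for name, value in A1.items():
    print(name, round(value, 4))
```

All five updated values coincide and are non-zero, as stated in the text.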

The results of the calculation described above correspond to the dotted line at $\log f_{\mathrm{O_2(g)}} = -75.3$ in Fig. 
4b. At this oxygen fugacity, the ranks of abundance of the model proteins in metastable equilibrium are identical 
to the ranks of experimental abundances. The figure was generated in whole by carrying out this procedure for 
different reference values of $\log f_{\mathrm{O_2(g)}}$. It can be seen in Fig. 4b that there is a narrow range on either side of 
$\log f_{\mathrm{O_2(g)}} = -75.3$ (ca. ±0.05) where the relative abundances of the proteins in metastable equilibrium occur in 
the same rank order. Beyond these limits, changing $f_{\mathrm{O_2(g)}}$ drives the composition of the metastable equilibrium 
assemblage to other states that do not overlap as closely with the experimental rankings. The experimental 
abundances of the proteins reported by [105] are 21400, 12200, 1840, 1720 and 358, respectively, in relative units. 
These abundances were scaled to the same total activity of residues (unity) used in the calculations to generate 
the experimental relative abundances plotted at the leftmost dotted line in Fig. 4b at $\log f_{\mathrm{O_2(g)}} = -78$. Under these 
conditions, the metastable equilibrium abundances of the proteins do not occur in exactly the same rank order as 
the experimental ones, but there is a greater overall correspondence with the experimental relative abundances. 
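
One way to carry out the scaling of experimental abundances is sketched below in Python. It assumes that "total activity of residues equal to unity" means that the sum over proteins of (activity × sequence length) equals one; this reading and the function name are assumptions of the example.

```python
import math

# Experimental abundances (relative units) and sequence lengths quoted in the text,
# in the order YLR208W, YHR098C, YDL195W, YNL049C, YPL085W
ABUNDANCES = [21400, 12200, 1840, 1720, 358]
LENGTHS = [297, 929, 1273, 876, 2195]


def scaled_log_activities(abundances, lengths, total_residue_activity=1.0):
    """Scale protein abundances so that sum(activity * length) equals the chosen total."""
    total_residues = sum(n * length for n, length in zip(abundances, lengths))
    return [math.log10(total_residue_activity * n / total_residues) for n in abundances]


experimental_log_a = scaled_log_activities(ABUNDANCES, LENGTHS)
print([round(x, 2) for x in experimental_log_a])
```

Values scaled in this way are directly comparable with the calculated logarithms of activities because both then refer to the same total activity of residues.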

Relative abundances of proteins within compartments 

The procedure outlined above for calculating the relative abundances of model proteins in ER to Golgi was repeated 
for each of the other compartments identified in [22]. Up to 50 experimentally most abundant proteins were chosen 
to model each of the compartments. The relative abundances of the proteins were calculated at 0.5 log unit 
increments from $\log f_{\mathrm{O_2(g)}} = -82$ to -70.5. Scatterplots of the experimental vs. calculated relative abundances 
for each set of proteins are shown in Figure S1. These comparisons were visually assessed to regress values of 
$\log f_{\mathrm{O_2(g)}}$, listed in Table 6, that yield the best fit between calculated and experimental relative abundances. The 
resulting calculated relative abundances are listed together with the experimental ones in Table S3; the best-fit 
scatterplots for each set of model proteins are shown in Fig. 5. 

Table 6: Oxygen fugacities, deviations and correlation coefficients in comparisons of intracompartmental protein 
interactions (a). 
a. Values of $\log f_{\mathrm{O_2(g)}}$ in each location were regressed by comparing calculated and experimental logarithms of 
activities of the most abundant proteins in different subcellular locations and of selected complexes for each location 
(Figure S1). n denotes the number of model proteins used in the calculations. RMSD values were calculated using 
Eqn. (13), and ρ denotes the Spearman rank correlation coefficient, calculated using Eqn. (14). 

The retrieval of optimal values of $\log f_{\mathrm{O_2(g)}}$ was aided by also calculating the root mean square deviation 
(RMSD) of logarithms of activities using Eqn. (13) and the Spearman rank correlation coefficient (ρ; Eqn. 14) 
between experimental and calculated logarithms of activities. The dotted lines in Fig. 5 were drawn at one RMSD 
on either side of the one-to-one correspondence, denoted by the solid lines in this figure. The RMSD values were 
used to identify outliers that are marked in Fig. 5 by letters and open symbols and that are listed in Table 7. To 
aid in distinguishing the points, they were assigned colors on a red-blue scale that denotes the average nominal 
oxidation state of carbon of the protein (Eqn. 12). 
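
The two summary statistics can be sketched in Python as follows. Eqns. (13) and (14) are not reproduced in this section, so the sketch assumes the usual definitions: RMSD as the square root of the mean squared difference, and Spearman's ρ as the Pearson correlation of ranks with ties given their average rank. The example inputs are the calculated logarithms of activities and experimental abundances of the five ER to Golgi proteins quoted earlier.

```python
import math


def rmsd(x, y):
    """Root mean square deviation between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def average_ranks(values):
    """Ranks starting at 1, with tied values assigned their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation coefficient (Pearson correlation of the ranks)."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


calculated = [-3.52, -3.61, -3.63, -3.65, -4.06]          # at log fO2(g) = -75.3
experimental = [math.log10(n) for n in [21400, 12200, 1840, 1720, 358]]
print(spearman(calculated, experimental))
```

Only the ranks enter ρ, so the unscaled experimental abundances can be used for this check; the RMSD, in contrast, requires the experimental values to be scaled to the same total activity of residues as the calculated ones.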

There is a considerable degree of scatter apparent in many of the plots shown in Fig. 5, so a low significance 
is attached to the $\log f_{\mathrm{O_2(g)}}$ values regressed from these comparisons. In specific cases such as late Golgi and 
nuclear periphery a lower overall deviation is apparent and there is a visual indication of a positive correlation 
between the calculated and experimental relative abundances. Because they were regressed from individual noisy 
data, the values of $\log f_{\mathrm{O_2(g)}}$ listed in Table 6 are probably not as representative of subcellular oxidation-reduction 
conditions as those listed in Table 3, which have the additional benefit of being partly based on known subcellular 
interactions (see above). 

The comparisons depicted in Fig. [5]and in Figure SI are important because they reveal that the range of protein 
abundance observed in cells is accessible in a metastable equilibrium assemblage at some values of log/o 2(g) - For 
example, the range of experimental abundances of the model proteins in actin covers about 1.6 orders of magnitude, 
while the calculated abundances vary over about 2.2 orders of magnitude. Extreme values of log/o 2(g) tend to 
weaken this correspondence (Figure SI). The lowest degree of correspondence occurs for the cytoplasmic proteins, 
where ~ 6 orders of magnitude separate the predicted relative abundances of the top 50 most abundant proteins, 
which in the experiments have a dynamic range spanning about 1.2 orders of magnitude. The great degree of 
scatter apparent in many of the comparisons in Fig. [5^ is troublesome. The scatter could be partly a consequence 
of including in the comparisons model proteins that do not actually interact with each other, despite their high 
relative abundances. To address this concern, a more selective approach was adopted below that takes account of 
fewer numbers of proteins that interact through the formation of complexes. 
Figure 5: Comparison of experimental and calculated logarithms of activities of proteins in compartments. 

Red and blue colors denote, respectively, low and high average nominal carbon oxidation states (Z_C) of the protein. 
Dotted lines are positioned at one RMSD above and below one-to-one correspondence, which is denoted by the 
solid lines. Outlying points are labeled with letters that are keyed to the proteins in Table 7. The values of 
log fO2(g) used in the calculations are listed in Table 5. 
a. Proteins are listed whose calculated logarithm of activity differs from experimental values by more than the 
root mean square deviation shown in Table 6. 

Relative abundances of proteins in complexes 

The correspondence between the calculated and experimental relative abundances of the five model proteins in ER 
to Golgi raises the question of what characteristics of the proteins might be responsible for this result. Searching 
the functional annotations of these proteins reveals that they are part of the COPII coat complex [111]. The 
inclusion of the COPII complex above was largely unintentional, as the procedure there was to look at the most 
abundant proteins in given compartments. Nevertheless, the results for that model system suggested that focusing 
on specific complexes in other compartments could yield interesting results. Because the interactions of proteins 
to form complexes are essential in cellular structure and in regulating functions of enzymes [51], factors that affect the 
relative abundances of the complexing proteins may be fundamental to the control of metabolic processes. 

The model complexes used in this study are identified in Table 8. Each complex was nominally associated with 
a subcellular compartment based on the names and descriptions of the complexes available in the literature. One 
exception is the cyclin-dependent protein kinase complex, the proteins of which are largely cytoplasmic and 
nuclear [22]; it is placed here in the slot for the ambiguous location because no definitely ambiguously localized 
complexes could be identified. For a similar reason, the proteins listed in Table 8 under punctate composite are 
not part of a named complex but were chosen because they are localized to early Golgi in addition to the punctate 
Table 8: Model proteins in complexes^a 

Name      ORF         Loc.  Note

1. actin: Arp2/3 complex (423)
Arc15     YIL062C           b
Arc18     YLR370C
Arc19     YKL013C
Arc35     YNR035C           a
Arc40     YBR234C           NA
Arp2      YDL029W
Arp3      YJR065C           X

2. ambiguous: cyclin-dependent protein kinase complex (343)
Cdc28     YBR160W           b
Cks1      YBR135W           a
Cln2      YPL256C
Cys4      YGR155W
Sic1      YLR079W
Clb3      YDL155W
Cln1      YMR199W

3. bud: actin-associated motor protein complex 2 (49) [115]
Myo2      YAL029C
She4      YKL130C           b
Mlc1      YBR130C
Myo1      YGL106W           X
Cmd1      YKL007W
Myo5      YIL034C           a

4. bud.neck: septin complex (333) [117]
Bud4      YJR092W           a
Cdc10     YCR002C           c
Cdc11     YJR076C
Cdc12     YHR107C
Cdc3      YLR314C           X
Shs1      YDL225W           b
Mdh1      YKL085W

5. cell.periphery: exocyst complex (120)
Exo84     YBR102C           NA
Sec10     YLR166C
Sec3      YER008C           b
Sec5      YDR166C           a
Sec6      YIL068C
Sec8      YPR055W           NA

6. cytoplasm: translation initiation factor eIF3 (45)
Fun12     YAL035W
Hcr1      YLR192C           c
Nip1      YMR309C           a
Prt1      YOR361C
Rli1      YDR091C
Rpg1      YBR079C
Tif34     YMR146C           X
Tif35     YDR429C           NA
Tif5      YPR041W           b

7. early.Golgi: SNARE complex (113) ; mi
Dsl1      YNL258C     *     a
Sec39     YLR440C
Tip20     YGL145W
Ufe1      YOR075W           NA
Use1      YGL098W
Pep12     YOR036W           X
Ykt6      YKL196C           X

8. endosome: ESCRT I & II complexes (122; 123)
Vps23     YCL008C           X
Vps28     YPL065W
Vps37     YLR119W           X
Mvb12     YGR206W           a
Vps22     YPL002C           b
Vps36     YLR417W
Vps25     YJR102C           X

9. ER: signal recognition complex (52)
Sec65     YML105C           NA
Srp14     YDL092W
Srp54     YPR088C           X
Srp68     YPL243W     *     a
Srp72     YPL210C           b

10. ER.to.Golgi: coatomer COPII complex (340)
Sec13     YLR208W           a
Sec16     YPL085W
Sec23     YPR181C           X
Sfb2      YNL049C
Sec24     YIL109C           NA
Grh1      YDR517W

11. Golgi: Golgi transport complex (293)
Cog1      YGL223C     *
Cog2      YGR120C           b
Cog3      YER157W
Cog4      YPR105C     *
Cog5      YNL051W           c
Cog6      YNL041C
Cog7      YGL005C
Cog8      YML071C
Iml1      YJR138W     *
Nrp1      YDL167C           a

12. late.Golgi: retrograde protein complex (114) [118]
Kar2      YJL034W     *     c
Vps52     YDR484W
Vps53     YJL029C     *     NA
Vps54     YDR027C           a
Vps51     YKR020W     *     b
Scj1      YMR214W

13. lipid.particle: sterol biosynthesis enzymes [119]
Erg9      YHR190W     *
Erg1      YGR175C
Erg7      YHR072W           c
Erg11     YHR007C     *
Erg24     YNL280C
Erg25     YGR060W     *
Erg26     YGL001C
Erg27     YLR100W           NA
Erg6      YML008C           d
Erg2      YMR202W
Erg3      YLR056W     *     a
Erg5      YMR015C
Erg4      YGL012W           b

14. microtubule: DASH complex EE]
Dam1      YGR113W           X
Duo1      YGL061C     *     a
Dad1      YDR016C     *
Dad2      YKR083C     *     b
Spc19     YDR201W     *
Spc34     YKR037C     *     NA
Ask1      YKL052C     *
Dad3      YBR233W-A   *
Dad4      YDR320C-A
Hsk3      YKL138C-A         X

15. mitochondrion: mitochondrial ribosome, small subunit (9)
Ehd3      YDR036C           f
Mrp13     YGR084C           a
Mrp17     YKL003C           NA
Mrp21     YBL090W           h
Mrp4      YHL004W
Mrp51     YPL118W
Mrps16    YPL013C           NA
Mrps17    YMR188C           c
Mrps18    YNL306W
Mrps28    YDR337W           e
Mrps5     YBR251W           d
Mrps8     YMR158W           X
Mrps9     YBR146W           X
Pet123    YOR158W           g
Rsm10     YDR041W           b
Rsm19     YNR037C           X
Rsm22     YKL155C
Rsm23     YGL129C
Rsm27     YGR215W           NA
Rsm7      YJR113C
Mrp1      YDR347W
Rsm25     YIL093C
Nam9      YNL137C

16. nuclear.periphery: nuclear pore complex [24]
Nup60     YAR002W
Nup170    YBL079W
Asm4      YDL088C           a
Nup84     YDL116W
Gle1      YDL207W
Nup42     YDR192C           X
Nup157    YER105C           c
Gle2      YER107C
Nic96     YFR002W           g
Nup145    YGL092W
Seh1      YGL100W           X
Nup49     YGL172W           j
Nup57     YGR119C
Nup159    YIL115C
Nup192    YJL039C
Nsp1      YJL041W           i
Nup82     YJL061W           f
Nup85     YJR042W
Nup120    YKL057C           X
Nup100    YKL068W           h
Nup133    YKR082W
Pom34     YLR018C           NA
Ndc1      YML031W
Nup188    YML103C           e
Nup116    YMR047C           NA
Pom152    YMR129W           b
Nup53     YMR153W
Nup1      YOR098C           d
Cdc31     YOR257W           X

17. nucleolus: small subunit processome (70) [120]
Utp8      YGR128C
Nan1      YPL126W           b
Utp10     YJL109C           a
Utp15     YMR093W
Utp4      YDR324C
Utp9      YHR196W

18. nucleus: RNA polymerase I (30)
Rpa49     YNL248C     *     NA
Rpa12     YJR063W
Rpa190    YOR341W
Rpa43     YOR340C
Rpc40     YPR110C           a
Rpa135    YPR010C     *
Rpb5      YBR154C           X

19. peroxisome: integral to peroxisomal membrane (GO:0005779)
Ant1      YPR128C
Inp2      YMR163C
Pex12     YMR026C
Pex15     YOL044W
Pex22     YAL055W           b
Pex3      YDR329C
Pex30     YLR324W           c
Pex31     YGR004W           a
Pex32     YBR168W           NA
Pxa1      YPL147W           X
Pxa2      YKL188C           X

20. punctate.composite: proteins localized here and early.Golgi
Arl1      YBR164C           d
Apm3      YBR288C
Bug1      YDL099W
Arf1      YDL192W
Luv1      YDR027C           a
Tvp23     YDR084C
Dop1      YDR141C           a
Kei1      YDR367W
Vrg4      YGL225W
Apl6      YGR261C
Aps3      YJL024C           c
Vps53     YJL029C           NA
Tvp38     YKR088C
Ssp120    YLR250W
NA        YMR010W
NA        YMR253C           NA
Kex2      YNL238W           NA
Mon2      YNL297C

21. spindle.pole: spindle-pole body complex (219) [116]
Pfk1      YGR240C     *     a
Spc72     YAL047C
Spc97     YHR172W
Spc98     YNL126W
Tub4      YLR212C           b

22. vacuolar.membrane: VU vacuolar ATPase complex (14)
Emi2      YDR516C     *
Vma6      YLR447C
Vph2      YKL119C           a
Bni1      YNL271C     *     b
Drs2      YAL026C
Gaa1      YLR088W     *     NA
Lys9      YNR050C
Nop6      YDL213C           c
Pdc1      YLR044C     *
Pgi1      YBR196C
Vac8      YEL013W
Vma10     YHR039C-A         d
Vma2      YBR127C
Vma7      YGR020C
Vph1      YOR270C
Vtc4      YJL012C           X
Yor1      YGR281W
Yra1      YDR381W     *     NA

23. vacuole: vacuolar proteases and other canonical proteins [12]
Ape1      YKL103C           b
Ape3      YBR286W
Lap3      YNL239W     *
Pep4      YPL154C           NA
Prb1      YEL060C     *
Prc1      YMR297W
Ams1      YGL156W     *     a
Ath1      YPR026W           X
Pho8      YDR481C
Vtc4      YJL012C           X
Ypt7      YML001W           c
Npc2      YDL046W           d
NA        YHR202W           NA

a. Numbers in parentheses refer to the ID of the complex, if available, from http://yeast-complexes.embl.de 
[124]. Compositions and localizations of complexes were also taken from references listed in square brackets. 
Symbols: "*" the protein was not localized in the compartment [22]; "X" or "NA" not tagged or no abundance 
[105]; "a", "b", etc. refer to outliers in Fig. 7. 
composite characterization [22]. Other exceptions are the vacuolar model proteins (proteases and other canonical 
vacuolar proteins [12]), enzymes of the ergosterol biosynthetic pathway, some of which are associated with the 
lipid particle [119], and proteins integral to the peroxisomal membrane, which were identified using the Gene 
Ontology (GO) annotations in the SGD [111]. Where they could be found, the ID numbers of the complexes in 
a yeast complex database [124] are listed in parentheses in Table 8, as are literature references that describe the 
composition and/or localization of the complexes. If any of the proteins in the complexes do not localize [22] 
to the compartment shown in Table 8, they are marked with an asterisk; those proteins that were not present 
in the YeastGFP database or that are lacking an abundance count therein [105] are marked with "X" and "NA", 
respectively. 

The calculated metastable equilibrium logarithms of activities of the proteins in each complex are shown as 
a function of log fO2(g) in Fig. 6. The calculated logarithms of activities of the proteins were compared with 
experimental ones by constructing scatterplots at 0.5 log unit intervals from log fO2(g) = -82 to -70.5, which 
are shown in Figure S1. As above, visual assessment of fit was the first resort to obtain values of log fO2(g) 
that maximize the correspondence with experimental relative abundances, but the RMSD and Spearman rank 
correlation coefficient were also considered in these comparisons. Because of the small sample size in many of 
the comparisons, the sign of the correlation coefficient is as useful as its magnitude in assessing the results. The 
resulting calculated relative abundances are listed together with the experimental ones in Table S4. 

The number of model proteins in each of the complexes is less than the number of most abundant proteins in 
each compartment considered in the preceding section, so the visible decrease in scatter is expected. Some of the 
model complexes represented in Fig. 7 exhibit an apparent positive correlation between calculated and experimental 
logarithms of activities; these include translation initiation factor eIF3, nuclear pore complex and proteins integral 
to peroxisomal membrane. An inverse correlation between calculated and experimental logarithms of activities is 
apparent for proteins in the ESCRT I & II complexes, signal recognition complex, and DASH complex. A few of the 
other complexes (Golgi transport complex, sterol biosynthesis enzymes) exhibit very little overall correspondence 
between calculated and experimental logarithms of activities. 

The results in Fig. 7 permit an interpretation of the relative energetic requirements for formation of different 
groups of interacting proteins. Take for example complex 14, which is the DASH complex that associates with 
the microtubule. An inverse correlation between the experimental and calculated relative abundances is apparent 
for this complex in Fig. 7. The RMSD between calculated and experimental logarithms of activities of proteins is 
1.05, which is among the highest listed in Table 6. Note from Eqn. (7) that a ~1 log unit change in the chemical 
activity of a chemical species corresponds to a Gibbs energy difference equal to 2.303RT. An average difference 
of ~1 between calculated and experimental logarithms of activity indicates that the formation of the proteins 
requires 2.303RT = 1364 cal mol^-1 per protein beyond what would be needed if the proteins formed in metastable 
equilibrium relative abundances. On the other hand, the formation in specific oxidation-reduction conditions of 
proteins making up translation initiation factor eIF3 and other assemblages where cellular abundances positively 
correlate with and span the same range as the metastable equilibrium distribution can proceed close to a local 
minimum energy required for protein formation. 
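
The arithmetic behind the 1364 cal mol^-1 figure can be checked with a short sketch. It assumes T = 298.15 K and R = 1.9872 cal mol^-1 K^-1; the function names are illustrative and are not part of the CHNOSZ package.

```python
import math

R_CAL = 1.9872   # gas constant in cal mol^-1 K^-1 (assumed value)
T_REF = 298.15   # 25 degrees C in kelvin


def gibbs_per_log_unit(temperature_k=T_REF):
    """Gibbs energy (cal/mol) for a one-log-unit change in activity.

    This is ln(10) * R * T, i.e. the 2.303RT factor in the text.
    """
    return math.log(10.0) * R_CAL * temperature_k


def excess_gibbs_energy(delta_log_activity, temperature_k=T_REF):
    """Energy (cal/mol) implied by a difference in log activity."""
    return delta_log_activity * gibbs_per_log_unit(temperature_k)


if __name__ == "__main__":
    print(round(gibbs_per_log_unit()))          # about 1364 cal/mol
    print(round(excess_gibbs_energy(1.05)))     # using the DASH complex RMSD
```

Under these assumptions an RMSD of 1.05 corresponds to a little more than 1400 cal mol^-1 per protein, in line with the order of magnitude quoted above.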

Because of their relatively high energy demands, proteins in complexes such as the DASH complex and the 
spindle pole body are likely to be more dynamic in the cell. Although a positive rank correlation coefficient 
for the latter complex is reported in Table 6, at a higher oxygen fugacity (log fO2(g) = -76) a strong inverse 
correlation obtains between experimental abundances and calculated metastable equilibrium relative abundances 
of the proteins in this complex (Figure S1). The finding made elsewhere of some inverse relationships between 
relative abundance of proteins and corresponding mRNA levels was also interpreted as evidence for additional 
effort on the part of the cell [125]. An inverse relationship that opposes equilibrium may be favored in evolution 
because of the strategic advantage of incorporating otherwise costly (rare) amino acids that increase enzymatic 
diversity [126]. The present results show that specific examples of inverse relationships in the relative abundances of 
proteins can be identified using a metastable equilibrium reference state that is conditioned by oxidation-reduction 
conditions. Chemical selectivity in the dynamic formation in the cell of high-energy proteins could lead to transient 
formation of complexes that function only under certain conditions. In contrast, complexing proteins that interact 
close to metastable equilibrium are more likely to be constitutively formed. 
Figure 6: Calculated logarithms of activities of model proteins in complexes. The numbered complexes are 
identified in Table 8. Metastable equilibrium activities of proteins in the complexes were calculated as a function 
of log fO2(g) for total activity of residues set to unity. Dotted red lines denote values of log fO2(g) (listed in Table 
6) and calculated relative abundances that were used in making Fig. 7. 
Figure 7: Comparison of experimental and calculated logarithms of activities of interacting proteins. 

Symbols are as in Fig. 5; the model proteins and the outliers are listed in Table 8. 
Concluding Remarks 

This study was concerned with thermodynamic selectivity of protein formation primarily as a function of one 
variable: oxidation-reduction potential represented by the logarithm of the fugacity of oxygen (log fO2(g)). In reality, 
many variables are changing in cells, including the hydration state, pH, activities of CO2 and H2S, temperature 
and pressure. These all factor into the Gibbs energy changes accompanying the overall chemical transformation 
between proteins. Apart from oxygen fugacity, these variables were held constant in most of the calculations 
reported here. It is tempting to explore the effects of these variables on the compositions of metastable equilibrium 
assemblages. Incorporation into the framework of protein folding reactions and a non-ideality contribution, or 
excess Gibbs energy, that would encompass the effects of electrostatic interactions and macromolecular crowding 
is another target for expanding the scope of the thermodynamic characterizations. 

The model results reported above were chosen in order to test specific predictions made using the hypothesis 
that the selection for or against metastable equilibrium has measurable consequences in organisms. The findings 
can be summarized as: 

1. The oxidation-reduction potential (log fO2(g)) limits of relative metastabilities of redoxin isoforms overlap 
with measured Eh (redox potential) in the cytoplasm and mitochondrion but not the nucleus. 

2. The model proteologs represent the overall amino acid compositions of proteins in different compartments. 
At relatively low oxidation-reduction potential, proteologs in order of decreasing relative metastability are 
those of ER, Golgi, cell periphery, mitochondrion, nuclear periphery and spindle pole. At higher oxidation- 
reduction potential, proteologs in order of decreasing relative metastability are those of actin, nucleolus, 
nucleus, vacuole, bud neck and microtubule. At intermediate oxygen fugacities, proteologs of lipid particle, 
peroxisome and early Golgi are relatively metastable compared to those of cytoplasm, vacuolar membrane 
and late Golgi. 

3. In a chemically reacting system starting with the cytoplasmic proteolog where all interactions include the 
proteologs of microtubule or spindle pole, environmental shifts in log fO2(g) going from -75 to -83.5 to 
-69 to -75 can drive the sequential formation of proteologs of spindle pole, microtubule, cytoplasm, actin, 
nucleus, cell periphery, bud neck and bud. 

4. Oxidation-reduction potentials within -78 < log fO2(g) < -74 give rise to metastable equilibrium 
populations of the most abundant model proteins within compartments in which the range of protein abundance 
comes closest to that seen in reported measurements. Substantial scatter is evident in the comparisons, 
but a moderate overall positive rank correlation was observed. 

5. Closer fits between calculated and experimental relative abundances were obtained within -80 < log fO2(g) < 
-73 by considering smaller numbers of model proteins that interact in complex formation. Strong positive 
correlations were found for, among others, cytoplasmic translation initiation factor eIF3 and nuclear pore 
complex; negative correlations were found for the microtubule-associated DASH complex and the endosomal 
ESCRT I & II complexes. 

This study contributes to understanding the products of evolution by quantifying the extent of departure from 
metastable equilibrium in populations of interacting proteins. The observed positive correlations are consistent 
with a trend of some populations of interacting proteins to be imprinted with the consequences of local energy 
minimization in chemical reactions. These results and observations also support the notion that changing oxidation- 
reduction potential can selectively promote or hold back the reactions leading to formation of complexing proteins 
in relative abundances seen in the cell. Combining proteomic data with metastable equilibrium calculations is 
therefore a promising avenue for predicting complexes that form in specific oxidation-reduction conditions that 
vary temporally and spatially in biochemical systems. 

Methods 

The essential steps in the calculations reported here are 1) defining standard states, 2) identifying model proteins for 
systems of interest, 3) assessing the relative abundances of model proteins in metastable equilibrium, 4) visualizing 
the results of the calculations on speciation or predominance diagrams and 5) comparing the computational results 
with experimental biochemical and proteomic data. 
Standard states and chemical activities 


The activity of a species is fundamentally related to the chemical potential of the species by 

$$ \mu = \mu^{\circ} + RT \ln a , \tag{7} $$

where R and T represent, respectively, the gas constant and the temperature, μ and μ° stand for the chemical 
potential and standard chemical potential, respectively, and a denotes activity. No provision for activity coefficients 
of proteins or other species was used in this study; under this approximation, the activity of an aqueous species is 
equal to its concentration (molality). 

The standard state for aqueous species including proteins specifies unit activity of the aqueous species in 
a hypothetical one molal solution referenced to infinite dilution. The standard molal Gibbs energies of the proteins 
were calculated with the CHNOSZ software package [70] using group additivity properties and parameters taken 
from [55]. 

Proteologs: overall compositions of proteins in compartments 

The overall amino acid compositions of proteins in 23 subcellular locations in S. cerevisiae were calculated by 
combining localization [22] and abundance [105] data for proteins measured in the YeastGFP project with amino 
acid compositions of proteins downloaded from the Saccharomyces Genome Database (SGD) [111]. Of 4155 
ORF names listed in the YeastGFP dataset, all but 12 are present in SGD (the missing ones are YAR044W, 
YBR100W, YDR474C, YFL006W, YFR024C, YGL046W, YGR272C, YJL012C-A, YJL017W, YJL018W, YJL021C 
and YPR090W). 

To generate proteologs that are most representative of each compartment, proteins that were annotated in the 
YeastGFP study as being localized to more than one compartment were excluded from this analysis (except for bud; 
see below), as were those for which no abundance was reported. The names of the open reading frames (ORFs) 
corresponding to the proteins in the YeastGFP data set were matched against the SGD's protein_properties.tab 
file downloaded on 2008-08-04. This search yielded a number of model proteins for each compartment, ranging 
from 5 (ER to Golgi) to 746 (cytoplasm); see Table 3. The names of the compartments used throughout the tables 
and figures in this paper correspond to the notation used in the YeastGFP data files (where spaces are replaced 
with a period). 

It was found that no proteins with reported abundances and localized to the bud were exclusive to that 
compartment, hence all of the proteins localized there (which also have localizations in other compartments) were 
taken as models for the bud proteolog. The amino acid composition of the proteolog for each compartment 
was calculated by taking the sum of the compositions of each model protein for a compartment in proportion 
to its fractional abundance in the total model protein population of the compartment. The resulting amino 
acid compositions are listed in Table S1. The corresponding chemical formulas of the nonionized proteologs and 
the calculated standard molal Gibbs energies of formation from the elements at 25 °C and 1 bar of the ionized 
proteologs are shown in Table 3. 
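
The abundance-weighted sum described above can be sketched as follows. This is a minimal illustration, not the code used in the study; the function name, the dictionary layout and the toy protein identifiers are assumptions made for the example.

```python
def proteolog_composition(compositions, abundances):
    """Abundance-weighted amino acid composition of a compartment.

    compositions: {protein_id: {amino_acid: count}}
    abundances:   {protein_id: abundance (e.g. molecules per cell)}
    Each protein contributes in proportion to its fractional abundance
    in the total model protein population of the compartment.
    """
    total = float(sum(abundances[p] for p in compositions))
    proteolog = {}
    for protein, composition in compositions.items():
        weight = abundances[protein] / total
        for amino_acid, count in composition.items():
            proteolog[amino_acid] = proteolog.get(amino_acid, 0.0) + weight * count
    return proteolog


if __name__ == "__main__":
    # Hypothetical two-protein compartment (toy numbers, not real data).
    toy_compositions = {
        "P1": {"Ala": 10, "Gly": 20},
        "P2": {"Ala": 30, "Ser": 10},
    }
    toy_abundances = {"P1": 300, "P2": 100}
    print(proteolog_composition(toy_compositions, toy_abundances))
```

Because the weights sum to one, the resulting proteolog has the abundance-weighted mean length of the model proteins rather than their total length.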

Metastability calculations 

Diagrams showing the predominant proteins and the relative abundances of proteins in metastable equilibrium were 
generated using the CHNOSZ software package [70]. These calculations take account of formation reactions of 
the proteins written for their residue equivalents [70]. This approach is demonstrated in the Results for a specific 
model system. 

The basis species, or perfectly mobile components of an open system [61], appearing in the formation reactions 
studied here are CO2(aq), H2O, NH3(aq), O2(g), H2S(aq) and H+. The reference activities used for the basis 
species were 10^-3, 10^0, 10^-4, 10^-7 and 10^-7, respectively, for CO2(aq), H2O, NH3(aq), H2S(aq) and H+. In 
the case of diagrams showing Eh as a variable, the aqueous electron (e-) was substituted for O2(g) in the basis 
species. Reference values for the activity of the electron (a_e-) or fO2(g) are not listed here because one or the other 
is used as an independent variable in each of the calculations described above. 
Conversion between scales of oxidation-reduction potential 

Conversion between the log fO2(g) and Eh scales of oxidation-reduction potential can be made by first writing the 
half-cell reaction for the dissociation of H2O as 

$$ \mathrm{H_2O} \rightleftharpoons \tfrac{1}{2}\mathrm{O_{2(g)}} + 2\mathrm{H^+} + 2e^- . \tag{8} $$

Taking pH = -log a_H+ and pe = -log a_e-, the logarithmic analog of the law of mass action for Reaction (8) can 
be written as 

$$ \log K = \tfrac{1}{2}\log f_{\mathrm{O_{2(g)}}} - 2\,\mathrm{pH} - 2\,\mathrm{pe} - \log a_{\mathrm{H_2O}} , \tag{9} $$

where log K stands for the logarithm of the equilibrium constant of Reaction (8) as a function of temperature and 
pressure. Eh is related to pe by [127] 

$$ \mathrm{pe} = \frac{F}{2.303RT}\,\mathrm{Eh} , \tag{10} $$

where F and R denote the Faraday constant and the gas constant, respectively. Combining Eqns. (9) and (10) 
yields the following expression for Eh as a function of log fO2(g) and other variables: 

$$ \mathrm{Eh} = \frac{2.303RT}{2F}\left(\tfrac{1}{2}\log f_{\mathrm{O_{2(g)}}} - 2\,\mathrm{pH} - \log a_{\mathrm{H_2O}} - \log K\right) . \tag{11} $$

At 25 °C and 1 bar, F/2.303RT = 16.903 volt^-1 and log K = -41.55; for pH = 7 and log a_H2O = 0, a value of 
Eh = 0 V corresponds to log fO2(g) = -55. Eqn. (11) permits the conversion between Eh and log fO2(g) as well 
at other temperatures, pHs, and activities of H2O. 
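
As a rough numerical check of this conversion, the sketch below implements Eqns. (9) to (11) with the 25 °C and 1 bar constants quoted above. The function names and default arguments are illustrative assumptions, and the constants would need to be replaced for other temperatures or pressures.

```python
F_OVER_2303RT = 16.903   # volt^-1 at 25 degrees C and 1 bar (from the text)
LOG_K = -41.55           # log K of H2O = 1/2 O2(g) + 2 H+ + 2 e- (from the text)


def eh_from_logfo2(logfo2, ph=7.0, log_a_h2o=0.0):
    """Eh in volts from log fO2(g), following Eqns. (9) and (10)."""
    pe = 0.5 * (0.5 * logfo2 - 2.0 * ph - log_a_h2o - LOG_K)
    return pe / F_OVER_2303RT


def logfo2_from_eh(eh, ph=7.0, log_a_h2o=0.0):
    """log fO2(g) from Eh in volts (inverse of eh_from_logfo2)."""
    pe = F_OVER_2303RT * eh
    return 2.0 * (2.0 * pe + 2.0 * ph + log_a_h2o + LOG_K)


if __name__ == "__main__":
    print(logfo2_from_eh(0.0))      # close to -55 at pH 7
    print(eh_from_logfo2(-75.0))    # Eh (V) at log fO2(g) = -75, pH 7
```

With these constants, log fO2(g) = -75 at pH 7 and unit activity of H2O corresponds to an Eh of roughly -0.29 V.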

Average nominal oxidation state of carbon 

Let us write the chemical formula of a species of interest as $\mathrm{C}_{n_C}\mathrm{H}_{n_H}\mathrm{N}_{n_N}\mathrm{O}_{n_O}\mathrm{S}_{n_S}^{Z}$, where Z denotes the net charge. 
The average nominal oxidation state of carbon (Z_C) of this species is given by 

$$ Z_{\mathrm{C}} = \frac{Z - n_H + 2\,(n_O + n_S) + 3\,n_N}{n_C} . \tag{12} $$

Eqn. (12) is consistent with the electronegativity rules described in [128] and is compatible with the equation 
for average oxidation number of carbon used in [129]. For example, Eqn. (12) can be used to calculate the 
average nominal oxidation states of carbon in CO2 and CH4, which are +4 and -4, respectively. Note that 
the proportions of oxygen and other covalently-bonded heteroatoms contribute to the value of Z_C of a protein 
or other molecule, but that proton ionization does not alter the nominal carbon oxidation state, because of the 
opposite contributions from Z and n_H in Eqn. (12). In the 4143 proteins identified in the YeastGFP subcellular 
localization study and found in the Saccharomyces Genome Database, the minimum and maximum of Z_C are 
-0.414 and 0.390, respectively. Of the proteins in this dataset, six have Z_C < -0.35 (YDR193W, YDR276C, 
YEL017C-A, YJL097W, YML007C-A, YMR292W) and six have Z_C > 0.15 (YCL028W, YHR053C, YHR055C, 
YKR092C, YMR173W, YPL223C). The points in the scatterplots in this paper (Figs. 5 and 7 and Figure S1) are 
colored on a continuous red-blue scale according to the value of Z_C of the proteins, where maximum red occurs 
at Z_C = -0.35 and maximum blue occurs at Z_C = 0.15. 
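
Eqn. (12) is simple enough to verify directly. The following sketch (function name and argument order are illustrative choices) reproduces the CO2 and CH4 values given above and shows, using glycine and its deprotonated form as an added example, that proton ionization leaves Z_C unchanged.

```python
def carbon_oxidation_state(n_c, n_h, n_n, n_o, n_s, charge=0):
    """Average nominal oxidation state of carbon, Eqn. (12)."""
    if n_c <= 0:
        raise ValueError("the formula must contain carbon")
    return (charge - n_h + 2 * (n_o + n_s) + 3 * n_n) / n_c


if __name__ == "__main__":
    print(carbon_oxidation_state(1, 0, 0, 2, 0))             # CO2
    print(carbon_oxidation_state(1, 4, 0, 0, 0))             # CH4
    print(carbon_oxidation_state(2, 5, 1, 2, 0))             # glycine, C2H5NO2
    print(carbon_oxidation_state(2, 4, 1, 2, 0, charge=-1))  # glycinate, C2H4NO2-
```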



Comparison with experimental relative abundances 

For these comparisons, experimental abundances of proteins in each model system were scaled so that the total chemical 
activity of residues was equal to unity. 
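
One way to carry out this scaling is sketched below. It assumes that the residue activity contributed by a protein is its activity multiplied by its sequence length, so that the abundances are divided by the total number of residues; this reading, along with the function and variable names, is an assumption made for illustration rather than a description of the script used in the study.

```python
import math


def scaled_log_activities(abundances, lengths):
    """Scale abundances so that the total activity of residues is unity.

    abundances: {protein_id: experimental abundance}
    lengths:    {protein_id: sequence length in residues}
    Returns {protein_id: log10 of the scaled protein activity}.
    """
    total_residues = sum(abundances[p] * lengths[p] for p in abundances)
    return {p: math.log10(abundances[p] / total_residues) for p in abundances}


if __name__ == "__main__":
    # Hypothetical abundances and lengths (toy numbers, not real data).
    toy_abundances = {"P1": 1000.0, "P2": 100.0}
    toy_lengths = {"P1": 100, "P2": 500}
    print(scaled_log_activities(toy_abundances, toy_lengths))
```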

The root mean square deviation between calculated and experimental logarithms of activities was calculated 
using 

$$ \mathrm{RMSD} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(X_{\mathrm{calc},i} - X_{\mathrm{expt},i}\right)^2} , \tag{13} $$

where X_calc,i and X_expt,i denote the calculated and experimental logarithms of activities and n stands for the 
number of proteins. 

The Spearman rank correlation coefficient (ρ) was calculated using 

$$ \rho = 1 - \frac{6d}{n\,(n^2 - 1)} , \tag{14} $$

where $d = \sum_{i=1}^{n}\left(x_{\mathrm{calc},i} - x_{\mathrm{expt},i}\right)^2$ and x_calc,i and x_expt,i stand for the ranks of the corresponding logarithms of 
activities. 
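
Eqns. (13) and (14) can be written out in a few lines. The sketch below is a plain-Python illustration with hypothetical function names; the rank formula in Eqn. (14) is the form that assumes no tied values, and the simple ranking helper used here makes the same assumption.

```python
import math


def rmsd(calc, expt):
    """Root mean square deviation between two equal-length sequences, Eqn. (13)."""
    n = len(calc)
    return math.sqrt(sum((c - e) ** 2 for c, e in zip(calc, expt)) / n)


def ranks(values):
    """Ranks (1 = smallest) of the values; ties are not handled specially."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(calc, expt):
    """Spearman rank correlation coefficient, Eqn. (14); requires n > 1."""
    n = len(calc)
    d = sum((rc - re) ** 2 for rc, re in zip(ranks(calc), ranks(expt)))
    return 1.0 - 6.0 * d / (n * (n ** 2 - 1))


if __name__ == "__main__":
    # Toy logarithms of activity (not data from the study).
    calculated = [-3.0, -2.0, -1.0, -4.0]
    experimental = [-2.5, -1.0, -2.0, -3.0]
    print(rmsd(calculated, experimental))
    print(spearman_rho(calculated, experimental))
```

For small complexes with only a handful of proteins, ρ can take only a few discrete values, which is one reason the sign of the coefficient is emphasized in the Results.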



Supporting Information 

Figure S1: Comparisons of relative abundances of proteins (PDF) 

Scatterplots of experimental vs. calculated abundance ranking and logarithm of activity of most abundant proteins 
and selected complexes in subcellular compartments are shown as a function of oxygen fugacity. 



Table S1: Amino acid compositions of model proteologs (CSV) 

Overall amino acid compositions of proteins in subcellular locations of S. cerevisiae were calculated from YeastGFP 
localization [22] and abundance [105] data downloaded from http://yeastgfp.ucsd.edu/ combined with protein 
compositions downloaded from the Saccharomyces Genome Database (http://www.yeastgenome.org/). The 
amino acid compositions of the proteologs were used to calculate the properties listed in Table 3. 

Table S2: Intercompartmental protein reactions (TXT) 

This table lists chemical reactions between residue equivalents of proteologs for interacting compartments. The 
charges of the proteologs were calculated at 25 °C, 1 bar and pH = 7. 

Table S3: Abundance data for model proteins in compartments (CSV) 

For up to 50 of the most abundant model proteins in each compartment, this table lists the ORF name, sequence length, 
average nominal oxidation state of carbon (Eqn. 12), computed standard molal Gibbs energy at 25 °C and 1 bar 
of the ionized protein, charge at pH = 7, and calculated and experimental logarithms of activity. 

Table S4: Abundance data for protein complexes (CSV) 

For the model complexes in each compartment (see Table 8), this table lists the same properties as in Table S3. 



Text S1: CHNOSZ software package (GZ) 

This is the complete package (source code, documentation and data files) for the CHNOSZ program, which was 
used together with the program script (below) to perform the calculations in this study. The package is designed to 
be used with the R software environment (http://www.R-project.org). Additional information about CHNOSZ 
is available in [70] and at http://www.chnosz.net 
Text S2: Program script and data files for generating figures (GZ) 

This program script and supporting files were used to generate the figures shown above. It includes the script itself 
(plot.R), protein compositions (generated from the protein_properties.tab file downloaded from the Saccharomyces 
Genome Database), calculated standard molal thermodynamic properties of the proteins (to speed up calculations), 
YeastGFP protein localization and abundance data [22, 105], and a .csv version of Table 6. To generate the figures, 
the contents of the zip file should all be placed into the R working directory before loading CHNOSZ. Then read 
in the script with source("plot.R"). More details on the operation are provided at the top of the script file. 

Text S3: Interactions between subcellular compartments in yeast (PDF) 

This file lists statements from [57] [54] [55] [53] [55] [57] used to identify the interactions between proteins in different 
compartments of Saccharomyces cerevisiae that are listed in Table 4. 